My problem is very similar to the one in: python - scatter plot with dates and 3rd variable as color

But I want the colors to vary according to three ranges of values within my 3rd variable.

for example:

#my 3rd variable consists of a column with these planet radii values:

    radii
1    70
2     6
3    54
4     3
5    0.3
...

I expect the colors to vary according to three ranges: radii > 8, 4 < radii < 8, and radii < 4.

I've tried using the simple code, presented in the other question:

db=table_with_3_columns()
x=db['column a']
y=db['column b']
z=db['radii']
plt.scatter(x,y,c=z,s=30)

But I don't know how to specify the 'c' parameter for different sets inside z. I've also tried using:

a=[]
for i in db['radii']
    if i>8:
       a['bigradii']=i
    elif i<4:
       a['smallradii']=i
    elif i<8 and i>4:
       a['mediumradii']=i
    return a

but I don't know how to proceed with that.

The expected result is a scatter plot whose dots are colored according to the values in the 3rd column, 'radii'. With the first snippet, all the dots come out black. With the second, I get an error telling me that i is a string and that I cannot put that on a list :(

How can I achieve that?

  • You should give Seaborn's scatterplot a shot; I believe it would fit your need: https://seaborn.pydata.org/generated/seaborn.scatterplot.html. It lets you conveniently control a scatter plot's size (size), color (hue) and marker shape (style). – LoneWanderer Oct 27 '19 at 23:10
  • The most succinct option is to create a new column with `pd.cut` and then plot the color based on the new value. – Trenton McKinney Sep 29 '21 at 15:21
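
A rough sketch of the `pd.cut` idea from the comment above, in case it helps. The sample values, the class labels, the `size_class` column name and the color names are all assumptions made up for illustration; only the `radii` column and the cut points 4 and 8 come from the question. The question does not say where values of exactly 4 or 8 belong, so the binning below is one possible choice.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical sample data standing in for the asker's table
db = pd.DataFrame({
    "column a": [1, 2, 3, 4, 5],
    "column b": [10, 20, 15, 30, 25],
    "radii": [70, 6, 54, 3, 0.3],
})

# Three bins: radii <= 4, 4 < radii <= 8, radii > 8
# (pd.cut includes the right edge of each bin by default; this is an assumption,
# since the question leaves the exact boundaries open)
db["size_class"] = pd.cut(
    db["radii"],
    bins=[-float("inf"), 4, 8, float("inf")],
    labels=["small", "medium", "large"],
)

# Map each class to a color; the color names are arbitrary choices
palette = {"small": "tab:blue", "medium": "tab:orange", "large": "tab:red"}
point_colors = db["size_class"].astype(str).map(palette)

plt.scatter(db["column a"], db["column b"], c=point_colors, s=30)
plt.show()
```

If the `radii` column was read in as text (which the "i is a string" error may hint at), it would probably need converting first, for example with `pd.to_numeric(db["radii"], errors="coerce")`.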

1 Answer

I think what you should do is:

  1. Create an empty list, which will later be passed to 'c' in the scatter function.
  2. Iterate over your data and use a switch-like sequence of if statements to add 1, 2 or 3 to the list, according to the discretization you mention. These numbers are mapped to different positions in the cmap palette, which means different colors.

Here is an example of what I mean:

import numpy as np
import matplotlib.pyplot as plt

# x and y will each have 100 random values in the range [0, 1)
x = np.random.rand(100)
y = np.random.rand(100)
# z will have 100 numbers, in order from 0 to 99
# z represents your third variable
z = np.arange(100)

colors = []

# add 1 or 2 to colors according to the z value
for i in z:
  if i > 50:
    colors.append(2)
  else:
    colors.append(1)

# roughly half the points will be painted with one color and the rest with another
plt.scatter(x, y, c=colors)
plt.show()
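
To extend the same idea to the three ranges in the question, one possible sketch (not the only way to do it) uses `np.digitize` to turn each radius into a class index 0, 1 or 2, and a three-color `ListedColormap` to pick the colors. The sample radii are taken from the question; the random x and y values, the variable names and the colors are assumptions for illustration. How values of exactly 4 or 8 are classified is also an assumption, since the question does not say.

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import ListedColormap

# Hypothetical coordinates; radii are the sample values from the question
rng = np.random.default_rng(0)
x = rng.random(5)
y = rng.random(5)
radii = np.array([70, 6, 54, 3, 0.3])

# Class index per point: 0 -> radii < 4, 1 -> 4 <= radii < 8, 2 -> radii >= 8
color_index = np.digitize(radii, bins=[4, 8])

# One color per class; vmin/vmax pin index 0, 1, 2 to the three colors
cmap = ListedColormap(["tab:blue", "tab:orange", "tab:red"])
plt.scatter(x, y, c=color_index, cmap=cmap, vmin=0, vmax=2, s=30)
plt.show()
```

Fixing `vmin` and `vmax` keeps the color assignment stable even when one of the three classes happens to be absent from the data.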

onofricamila