
I have two numpy arrays, x and y, with 7000 elements each. I want to make a scatter plot of them giving each point a different color depending on these conditions:

-BLACK if x[i]<10.

-RED if x[i]>=10 and y[i]<=-0.5

-BLUE if x[i]>=10 and y[i]>-0.5 

I tried creating a list of the same length as the data with the color I want to assign to each point and then plot the data with a loop, but it takes me a long time to run it. Here's my code:

import numpy as np
import matplotlib.pyplot as plt

#color list with same length as the data
col=[]
for i in range(0,len(x)):
    if x[i]<10:
        col.append('k') 
    elif x[i]>=10 and y[i]<=-0.5:
        col.append('r') 
    else:
        col.append('b') 

#scatter plot
for i in range(len(x)):
    plt.scatter(x[i],y[i],c=col[i],s=5, linewidth=0)

#add horizontal line and invert y-axis
plt.gca().invert_yaxis()
plt.axhline(y=-0.5,linewidth=2,c='k')

Before that, I tried creating the same color list in the same way, but plotting the data without the loop:

#scatter plot
plt.scatter(x,y,c=col,s=5, linewidth=0)

Even though this plots the data much, much faster than the for loop, some of the scattered points appear with the wrong color. Why does plotting the data without a loop give some points an incorrect color?

I also tried defining three sets of data, one for each color, and adding them to the plot separately. But this is not what I am looking for.

Is there a way to pass the list of colors I want for each point through the scatter plot's arguments, so that I do not need the for loop?

PS: This is the plot I get when I don't use the for loop (wrong one):

enter image description here

And this one when I use the for loop (correct):

enter image description here

Trenton McKinney
Argumanez
  • if you put list as value of c, plot will have different colors depending on that list, have you tried that? – Ada Borowa Nov 25 '16 at 11:08
  • The problem is that when I do that after creating the list I get points with wrong colors in the plot. But when I use the loop everything is Ok. – Argumanez Nov 25 '16 at 11:11
  • How do you know you get wrong colors in the plot? Can you add some images which illustrate this? It would also be helpful if you can create a [Minimal, Complete, and Verifiable Example](http://stackoverflow.com/help/mcve). – DavidG Nov 25 '16 at 11:18
  • You're right, I've just added the images. I know I get wrong colors in the plot because of where the points are located in it. Each region of the plot corresponds to a color. – Argumanez Nov 25 '16 at 11:33

1 Answer


This can be done using numpy.where. Since I do not have your exact x and y values, I will use some fake data:

import numpy as np
import matplotlib.pyplot as plt

#generate some fake data
x = np.random.random(10000)*10
y = np.random.random(10000)*10

col = np.where(x<1,'k',np.where(y<5,'b','r'))

plt.scatter(x, y, c=col, s=5, linewidth=0)
plt.show()

This produces the plot below:

enter image description here

The line col = np.where(x<1,'k',np.where(y<5,'b','r')) is the important one. It produces a NumPy array of strings, the same size as x and y, filled with 'k', 'b' or 'r' depending on the conditions. Where x is less than 1 the entry is 'k'; otherwise, where y is less than 5 the entry is 'b'; and where neither condition is met the entry is 'r'. This way, you do not need a loop to build the colors or to plot your graph.
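To see how the nested np.where call is evaluated, here is a small sketch on four hand-picked points. The values are made up for illustration, and the intermediate name inner is not part of the code above; it only splits the call into its two steps.

```python
import numpy as np

# Four made-up points, chosen so every branch is exercised.
x = np.array([0.5, 3.0, 7.0, 0.2])
y = np.array([9.0, 2.0, 8.0, 1.0])

# Inner call: choose between 'b' and 'r' from y alone.
inner = np.where(y < 5, 'b', 'r')

# Outer call: wherever x < 1 use 'k', otherwise keep the inner choice.
col = np.where(x < 1, 'k', inner)

print(col.tolist())  # ['k', 'b', 'r', 'k']
```

The result is a NumPy array with one color code per point, which can be passed directly as c=col to plt.scatter.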

For your specific data you will have to change the values in the conditions of np.where.
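As a rough sketch of what that could look like with the thresholds from the question (black if x < 10, red if x >= 10 and y <= -0.5, blue otherwise): the helper name assign_colors, its parameter names and the sample values are hypothetical and only serve the illustration.

```python
import numpy as np

def assign_colors(x, y, x_cut=10, y_cut=-0.5):
    """Return one color code per point, using the rules from the question.

    assign_colors, x_cut and y_cut are illustrative names, not an existing API.
    """
    x = np.asarray(x)
    y = np.asarray(y)
    # 'k' where x < x_cut; otherwise 'r' where y <= y_cut, else 'b'.
    return np.where(x < x_cut, 'k', np.where(y <= y_cut, 'r', 'b'))

# Made-up sample points, including values exactly on both thresholds.
x = np.array([5.0, 12.0, 15.0, 10.0])
y = np.array([-1.0, -0.5, 0.3, -2.0])

col = assign_colors(x, y)
print(col.tolist())  # ['k', 'r', 'b', 'r']
```

With the real data, the resulting array would be used as in the question's loop-free call: plt.scatter(x, y, c=col, s=5, linewidth=0).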

DavidG