8

I have a signal, and I would like to color in red the points that are too far from the mean of the signal. For example:

import numpy as np
import matplotlib.pyplot as plt

k=[12,11,12,12,20,10,12,0,12,10,11]
x2=np.arange(1,12,1)
plt.scatter(x2,k, label="signal")
plt.show()

I would like to color the data points 20 and 0 in red and give them a special label like "warning". I read "matplotlib: how to change data points color based on some variable", but I am not sure how to apply it to my case.

user3841581

2 Answers

12

If you want different labels, you need different plots.
Filter your data according to your formula.
In this case I took the values that are more than 1.5 standard deviations away from the mean. In case you don't know, in numpy you can use boolean masks to index arrays and take only the elements where the mask is True. You can also easily flip the mask with the complement operator ~.
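
To show the masking idea on its own, here is a minimal sketch with made-up values (the array `values` and the threshold 5 are only for illustration and are not part of the original data):

```python
import numpy as np

values = np.array([3, 8, 1, 9])  # made-up values for illustration

# an elementwise comparison yields a boolean array of the same shape
mask = values > 5

selected = values[mask]   # elements where the mask is True
rest = values[~mask]      # ~ flips the mask: elements where it is False

print(selected)  # [8 9]
print(rest)      # [3 1]
```

The full example below applies the same pattern to both `x2` and `k`, so that the x and y values of each group stay aligned.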

import matplotlib.pyplot as plt
import numpy as np

k=np.array([12,11,12,12,20,10,12,0,12,10,11])
x2=np.arange(1,12,1)

# find out which values are more than 1.5*std away from the mean
warning = np.abs(k-np.mean(k)) > 1.5*np.std(k)

# no plt.hold(True) needed: successive plotting calls draw on the same axes by default
# (plt.hold was removed in Matplotlib 3.0)

# draw some lines behind the scatter plots (using zorder)
plt.plot(x2, k, c='black', zorder=-1)

# scatter valid (not warning) points in blue (c='b')
plt.scatter(x2[~warning], k[~warning], label='signal', c='b')

# scatter warning points in red (c='r')
plt.scatter(x2[warning], k[warning], label='warning', c='r')

# draw the legend
plt.legend()

# show the figure
plt.show()

This is what you get:

[Image: the resulting plot]

swenzel
  • swenzel, what if I wanted to use a line plot instead of the scatter plot, but still have the warning points in red? – user3841581 Sep 23 '15 at 14:53
  • Well, in that case the easiest solution would be to cheat a bit and add an extra plot that is rendered before the two scatter plots. It *is* possible to have markers in different colors within one plot, but getting the legend right might not be as easy. You could probably create a color scheme, plot with it, and annotate that, but for simplicity I'd prefer it this way. Once you want a gradual color change indicating something like severity, the color scheme approach might be better, though. – swenzel Sep 23 '15 at 15:07
  • I have a small issue: assuming that some entries are NaN, how do I compute the mean taking those NaNs into account? – user3841581 Oct 26 '15 at 18:54
  • @user3841581 Use [np.nanmean](http://docs.scipy.org/doc/numpy-dev/reference/generated/numpy.nanmean.html). There is also [np.nanstd](http://docs.scipy.org/doc/numpy-dev/reference/generated/numpy.nanstd.html) ;) – swenzel Oct 26 '15 at 22:08
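
As a rough sketch of the NaN case raised in the comments: the warning mask can be computed with `np.nanmean` and `np.nanstd` in place of `np.mean` and `np.std`. The positions of the NaN entries below are an assumption made only for illustration.

```python
import numpy as np

# the signal from the question, with two entries replaced by NaN (illustrative assumption)
k = np.array([12, 11, np.nan, 12, 20, 10, 12, 0, np.nan, 10, 11])

# nanmean / nanstd ignore NaN entries when computing the statistics
mean = np.nanmean(k)
std = np.nanstd(k)

# comparisons involving NaN evaluate to False, so NaN entries are never flagged
warning = np.abs(k - mean) > 1.5 * std

print(k[warning])
```

The resulting `warning` mask can be used exactly like the one in the answer. Points whose value is NaN end up in the `~warning` group, where matplotlib should simply not draw them.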
7

If you want just the colors, then try:

import numpy as np
import matplotlib.pyplot as plt

k=[12,11,12,12,20,10,12,0,12,10,11]
x2=np.arange(1,12,1)

# Calculate an outlier limit (I chose 2 Standard deviations from the mean)
k_bar = np.mean(k)
outlier_limit = 2*np.std(k)
# Generate a colour vector
kcolors = ['red' if abs(value - k_bar) > outlier_limit else 'yellow' for value in k]

#Plot using the colour vector
plt.scatter(x2,k, label="signal", c = kcolors)
plt.show()
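
The colour vector can also be built without a Python loop. The following is a sketch of one alternative using `np.where`, not the answer's own method; the name `is_outlier` is introduced here for illustration.

```python
import numpy as np

k = np.array([12, 11, 12, 12, 20, 10, 12, 0, 12, 10, 11])

# same limit as above: 2 standard deviations from the mean
is_outlier = np.abs(k - np.mean(k)) > 2 * np.std(k)

# np.where picks 'red' where the mask is True and 'yellow' elsewhere
kcolors = np.where(is_outlier, 'red', 'yellow')

print(kcolors)
```

The result can be passed to the scatter call above as `c=kcolors.tolist()`.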
Hannes Ovrén
TMrtSmith