
I have a simple scatter plot

x<-rnorm(100)
y<-rnorm(100)
z<-rnorm(100)

I want to draw plot(x,y), but with the color of each point determined by z.

I would also like to be able to define how many groups (and thus colours) z is split into. The grouping should be resistant to outliers (for example, by splitting the z density into n equal-density groups).

Until now I have done this manually. Is there a way to do it automatically?

Note: I want to do this with base R not with ggplot.

zx8754
ECII

1 Answer


You can pass a vector of colours to the col parameter, so it is just a matter of defining your z groups in a way that makes sense for your application. Base R has the cut() function, and the Hmisc package has cut2(), which offers a bit more flexibility. For picking reasonable colour palettes, the RColorBrewer package is invaluable. Here's a quick example, after defining x, y and z:

z.cols <- cut(z, 3, labels = c("pink", "green", "yellow"))
plot(x,y, col = as.character(z.cols), pch = 16)


You can add a legend manually with legend(). Unfortunately, I don't think all plot types accept a vector for the col argument: type = "p" works, but plot(x,y, type = "l", col = as.character(z.cols)) comes out as a single colour for me. For such plots, you can add the different colours with lines(), segments(), or whichever low-level plotting command you need. See the answer by @Andrie for doing this with type = "l" plots in base graphics here.
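A minimal sketch of the legend and the per-segment colouring described above, assuming the same x, y and z as in the question. The fixed seed, the palette, and the choice to colour each segment by the group of its starting point are illustrative assumptions, not the only way to do it.

```r
set.seed(1)  # assumption: fixed seed so the sketch is reproducible
x <- rnorm(100)
y <- rnorm(100)
z <- rnorm(100)

pal <- c("pink", "green", "yellow")
z.grp <- cut(z, 3)                  # three equal-width intervals of z
z.cols <- pal[as.integer(z.grp)]    # one colour per point

# Scatter plot with a legend built from the interval labels
plot(x, y, col = z.cols, pch = 16)
legend("topright", legend = levels(z.grp), col = pal, pch = 16, title = "z")

# Line plot: set up empty axes, then draw each segment separately
# so that it can carry its own colour (group of the segment's start point)
n <- length(x)
plot(x, y, type = "n")
segments(x[-n], y[-n], x[-1], y[-1], col = z.cols[-n])
```

Indexing the palette with as.integer(z.grp) avoids storing colour names as factor labels, which keeps the interval labels available for the legend.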

Chase
  • Does cut() split in equal content intervals? – ECII Dec 12 '11 at 13:57
  • @ECII - not by default. However, `Hmisc::cut2()` makes this easy: its `m` argument sets the minimum number of observations in each group. Something like `z.cols2 <- cut2(z, m = length(z) / 3)` should do the trick. – Chase Dec 12 '11 at 14:03
  • Shouldn't the g argument in cut2 be better? cut2(z, g=3) ? – ECII Dec 12 '11 at 14:23
  • As a side note: if you want to set your colors based on the y-values, obviously just put y instead of z into those `cut` tools. – Carl Witthoft Dec 12 '11 at 14:49
  • @ECII - probably 6 in one, 1/2 dozen in the other...I've always used the `m` argument before, but you're right that `g` is probably more aptly suited for this task. Thanks for the heads up! – Chase Dec 12 '11 at 14:59