This is probably a trivial thing to do in R, but I can't find any built-in function that does that:

How can I transform a vector of values (say, numeric values) into a vector of colours, given a colour ramp?

Zabba
Pierre
  • Can you give some example data? How are the numerics represented: 1 number or separate values for RGBA channels? What colour ramp do you want (you say given, so can you give us one to work with?) – Gavin Simpson Jun 12 '11 at 07:35
  • Let's say I want to match up each value in the vector `mtcars$mpg` with the colour ramp `terrain.colors(n = 64)`. What I need is basically to translate each value in `mtcars$mpg` into the colour it would get if plotted using the colour ramp `terrain.colors(n = 64)`. – Pierre Jun 12 '11 at 07:37
  • How do you want to do the matching? Assign the first colour to the lowest value of `mpg` and the last colour to the highest value of `mpg`, then assign each of the other values to one of the terrain colours by binning the range of values into 64 groups? – Gavin Simpson Jun 12 '11 at 08:11
  • Yes - I was thinking of a simple linear interpolation: the first colour would be assigned to the minimum value, while the last colour would represent the maximum value in the vector. Each of the other values in the vector would get its colour from a linear binning into 64 colours. – Pierre Jun 12 '11 at 08:15
  • OK, thanks. Have included an answer that does that. – Gavin Simpson Jun 12 '11 at 08:23

1 Answer

The idea in my comment above can be implemented with `seq()` and `cut()`. The first step is to create equal-interval bins over the range of the data. We need 65 break points to define 64 bins, one per colour:

brks <- with(mtcars, seq(min(mpg), max(mpg), length.out = 65))

We now assign each of the observations to one of the bins formed by the breaks `brks`:

grps <- with(mtcars, cut(mpg, breaks = brks, include.lowest = TRUE))

`grps` is a factor indicating to which bin each observation in the data is assigned:

> head(grps)
[1] (20.7,21]   (20.7,21]   (22.5,22.9] (21,21.4]   (18.5,18.8]
[6] (17.7,18.1]
64 Levels: [10.4,10.8] (10.8,11.1] (11.1,11.5] ... (33.5,33.9]

As `grps` is stored internally as integer codes (`1:nlevels(grps)`), we can use it to index the colour ramp:

> terrain.colors(64)[grps]
 [1] "#CADF00FF" "#CADF00FF" "#E6D90EFF" "#D3E100FF" "#96D300FF"
 [6] "#85CF00FF" "#3CBA00FF" "#E8C133FF" "#E6D90EFF" "#9ED500FF"
[11] "#85CF00FF" "#67C700FF" "#76CB00FF" "#51C000FF" "#00A600FF"
[16] "#00A600FF" "#43BC00FF" "#F1D6D3FF" "#EFBEACFF" "#F2F2F2FF"
[21] "#DCE300FF" "#51C000FF" "#51C000FF" "#29B400FF" "#9ED500FF"
[26] "#EBB16EFF" "#EAB550FF" "#EFBEACFF" "#58C300FF" "#AFD900FF"
[31] "#4ABE00FF" "#D3E100FF"
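
The binning steps above can be wrapped in a small helper. This is only a minimal sketch and not part of the original answer: the name `vals_to_cols()` and its arguments are hypothetical, and it assumes `x` is numeric, has no missing values, and contains at least two distinct values.

```r
# Hypothetical helper: map numeric values to colours from a palette
# using equal-width bins over the observed range of x.
vals_to_cols <- function(x, pal) {
  stopifnot(is.numeric(x), length(pal) >= 1)
  rng <- range(x)
  stopifnot(diff(rng) > 0)
  # one more break point than there are colours
  brks <- seq(rng[1], rng[2], length.out = length(pal) + 1)
  # labels = FALSE makes cut() return the integer bin codes directly
  bins <- cut(x, breaks = brks, include.lowest = TRUE, labels = FALSE)
  pal[bins]
}

cols <- vals_to_cols(mtcars$mpg, terrain.colors(64))
```

Passing the palette as a plain character vector means any ramp (`terrain.colors()`, `heat.colors()`, a hand-picked vector) can be used without changing the function.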

An alternative might be to use `colorRamp()`, which returns a function that interpolates a set of provided colours. The returned function takes values in the range [0,1], which span the colour ramp from its first colour to its last. First we produce a colour ramp function that interpolates between the colours red and green:

FUN <- colorRamp(c("red","green"))

We then take our input data and rescale it to the [0,1] interval, so that the minimum maps to 0 and the maximum to 1:

MPG <- with(mtcars, (mpg - min(mpg)) / diff(range(mpg)))

We then use FUN() to generate the colours:

> cols <- FUN(MPG)
> head(cols)
         [,1]      [,2] [,3]
[1,] 139.9787 115.02128    0
[2,] 139.9787 115.02128    0
[3,] 120.4468 134.55319    0
[4,] 135.6383 119.36170    0
[5,] 164.9362  90.06383    0
[6,] 171.4468  83.55319    0

The object returned by `FUN()` is a matrix of values for the red, green, and blue channels, each on a 0 to 255 scale. To convert these to hex codes that R can use, we employ the `rgb()` function with `maxColorValue = 255`:

> rgb(cols, maxColorValue = 255)
 [1] "#8B7300" "#8B7300" "#788600" "#877700" "#A45A00" "#AB5300"
 [7] "#D42A00" "#679700" "#788600" "#9F5F00" "#AE5000" "#BD4100"
[13] "#B44A00" "#CA3400" "#FF0000" "#FF0000" "#D02E00" "#10EE00"
[19] "#25D900" "#00FF00" "#867800" "#C73700" "#CA3400" "#DF1F00"
[25] "#9F5F00" "#47B700" "#55A900" "#25D900" "#C43A00" "#9A6400"
[31] "#CD3100" "#877700"
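
The rescale, interpolate, and convert steps can also be collected into one function. Again this is a sketch under assumptions rather than the answer's own code: `ramp_cols()` is a hypothetical name, the three-colour ramp is just an illustration, and `x` is assumed to be numeric without missing values and with at least two distinct values.

```r
# Hypothetical helper: rescale x to [0, 1], interpolate along a colour
# ramp with colorRamp(), and return hex strings.
ramp_cols <- function(x, colours) {
  stopifnot(is.numeric(x), !anyNA(x))
  rng <- range(x)
  stopifnot(diff(rng) > 0)
  scaled <- (x - rng[1]) / diff(rng)      # min -> 0, max -> 1
  rgb_mat <- colorRamp(colours)(scaled)   # one row per value, channels in 0..255
  rgb(rgb_mat, maxColorValue = 255)
}

hex <- ramp_cols(mtcars$mpg, c("blue", "white", "red"))
```

Unlike the binned approach, every distinct input value can get its own interpolated colour here, so the result is not limited to a fixed number of colours.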
Gavin Simpson
  • It is worth pointing out that if you are intending to use the above to do some plotting, the lattice and ggplot2 packages provide consistent plotting functions that can do the above operations for you, especially ggplot2. – Gavin Simpson Jun 12 '11 at 09:23
  • @Gavin (+1) I think `colorRampPalette()` might also be a good option. – chl Jun 12 '11 at 10:30
  • @chl Good pointer, though `colorRamp()` seems a better fit: you provide it the colours to interpolate and it returns a function that gives the colour for a supplied numeric value in the range [0,1], whereas `colorRampPalette()` is similar to `terrain.colors()` etc. – Gavin Simpson Jun 12 '11 at 12:17
  • FYI `colorRamp` is really slow in my experience. – hadley Jun 12 '11 at 14:50
  • @Gavin Yes, I'm a heavy ggplot2 user myself. I'm actually writing a package for KML generation, which is why I'm not using ggplot2 routines. However, what I would like to implement is *exactly* Hadley's approach: the user specifies a color ramp (e.g. using RColorBrewer) and a variable, and the function assigns the colours automatically. There must be such a system in ggplot2, but I couldn't find it in the source. Maybe Hadley could comment on that? – Pierre Jun 13 '11 at 00:22
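
Following the comments about `colorRampPalette()` and the speed of `colorRamp()`, one possible approach is to build a fixed-size palette once and then look colours up by bin index. The sketch below is an assumption-laden illustration, not ggplot2's mechanism: `palette_cols()` and its `n` argument are hypothetical, `x` is assumed numeric with no missing values and at least two distinct values, and the colour vector could just as well come from `RColorBrewer::brewer.pal()`.

```r
# Hypothetical helper: build an n-colour palette once with
# colorRampPalette(), then pick a colour for each value by bin index.
palette_cols <- function(x, colours, n = 64) {
  stopifnot(is.numeric(x), !anyNA(x), n >= 2)
  rng <- range(x)
  stopifnot(diff(rng) > 0)
  pal <- colorRampPalette(colours)(n)     # the ramp is evaluated only n times
  brks <- seq(rng[1], rng[2], length.out = n + 1)
  # rightmost.closed = TRUE keeps max(x) in bin n rather than bin n + 1
  idx <- findInterval(x, brks, rightmost.closed = TRUE)
  pal[idx]
}

kml_cols <- palette_cols(mtcars$mpg, c("yellow", "red"))
```

Note that `findInterval()` uses left-closed bins, so a value that falls exactly on an interior break lands one bin higher than it would with the default `cut()`. Whether this is actually faster than calling `colorRamp()` on every value depends on the length of `x` and is worth timing on real data.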