I have a data frame (mappedUn) with the following structure:

C1  C2  C3  C4  C5  C6
1   1   1   3   1   1
3   3   3   16  3   3
10  NA  10  NA  6   6
11  NA  11  NA  10  11
NA  NA  NA  NA  11  NA
NA  NA  NA  NA  12  NA

Note: I have trimmed the entries in the example above to fit here, and I have renamed the columns to keep it simple.

I was wondering if there is a way to color-code scatter plots in R. I am using pairs() to draw the scatter plots, and the call I run is:

pairs(mappedUn[1:6])

Here is what I get:

[Image: scatter plot matrix produced by pairs(mappedUn[1:6])]

Notice that some plots have two points, some have three, and so on. Is there a way to give each plot a different background color based on how many points it has, for instance red for 4 points, yellow for 3, green for 2, etc.?

My ultimate goal is to visually distinguish the plots with a high number of common points.

Snedden27
  • Please show a [reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) of the code you used to create the plot (that way an answerer could adapt the code to provide an answer) – David Robinson Mar 09 '15 at 16:38
  • Yes, you are right. I have added a few lines of the code now. – Snedden27 Mar 09 '15 at 17:16

1 Answer

The key here is to customize the panel argument of pairs(). Try the following and see whether it meets your requirement.

n.notNA <- function(x){
  # define the function that returns the number of non-NA values
  return(length(x) - sum(is.na(x)))
}
myscatterplot <- function(x, y){
  # ll is used for storing the parameters for plotting region
  ll <- par("usr") 
  # bg is used for storing the color (an integer) of the background of current panel, which depends on the number of points. When x and y have different numbers of non-NA values, use the smaller one as the value of bg.
  bg <- min(n.notNA(x), n.notNA(y))
  # plot a rectangle framework whose dimension and background color are given by ll and bg
  rect(ll[1], ll[3], ll[2], ll[4], col = bg)
  # fill the rectangle with points
  points(x, y)
}
# "panel = myscatterplot" means in each panel, the plot is given by "myscatterplot()" using appropriate combination of variables
pairs(mappedUn[1:6], panel = myscatterplot)
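
The code above passes the point count straight to col, so the color comes from R's integer palette rather than from the red/yellow/green scheme in the question. A minimal sketch of one way to map counts to named colors follows. The names count_colors, n_shared, panel_color and count_panel are hypothetical helpers introduced here, the data frame is rebuilt from the trimmed table in the question, and counting rows where both values are present (instead of taking the smaller non-NA count) is an assumption about what "number of points" should mean.

```r
# Data copied from the trimmed example table in the question.
mappedUn <- data.frame(
  C1 = c(1, 3, 10, 11, NA, NA),
  C2 = c(1, 3, NA, NA, NA, NA),
  C3 = c(1, 3, 10, 11, NA, NA),
  C4 = c(3, 16, NA, NA, NA, NA),
  C5 = c(1, 3, 6, 10, 11, 12),
  C6 = c(1, 3, 6, 11, NA, NA)
)

# Hypothetical lookup: names are point counts, values are background colors.
count_colors <- c("2" = "green", "3" = "yellow", "4" = "red")

# Number of points actually drawn: rows where both x and y are present.
n_shared <- function(x, y) sum(!is.na(x) & !is.na(y))

# Background color for a given count; counts not in the lookup get white.
panel_color <- function(n) {
  col <- count_colors[as.character(n)]
  if (is.na(col)) "white" else unname(col)
}

# Panel function: fill the plotting region, then draw the points on top.
count_panel <- function(x, y, ...) {
  usr <- par("usr")  # c(x_left, x_right, y_bottom, y_top)
  rect(usr[1], usr[3], usr[2], usr[4], col = panel_color(n_shared(x, y)))
  points(x, y, ...)
}

pairs(mappedUn[1:6], panel = count_panel)
```

Extending count_colors with more entries (for example "5" or "6") is enough to cover larger counts; anything missing from the lookup simply keeps a white background.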

A related question: "R: How to colorize the diagonal panels in a pairs() plot?"

JellicleCat
  • It does work. I am still trying to figure out what exactly is happening here, though, as I am still getting used to R. – Snedden27 Mar 10 '15 at 12:35