I would like to generate a matrix of scatterplots from the following data frame.
# Generate some fake data
set.seed(123)
fakeData <- rnorm(10)
df <- data.frame(Type=c(rep("A", 5), rep("B", 5)),
                 Syst=fakeData, Bio=2*fakeData, Blr=fakeData^2)
If I use the pairs function, I get scatterplots both below and above the diagonal of my scatterplot matrix.
I do want to keep the scatterplots in the upper panel. In the lower panel, however, I would like to "plot" the correlation coefficients of my data.
I have looked for an answer online and, despite finding some good explanations (here, here, here too, and here as well), I have had no success so far. While illuminating, these examples don't cover cases where the data frame contains data from different levels.
As my data indicate, there are two levels in my data frame, "A" and "B". Hence, I'd like to have two correlation coefficients in each "box" of my lower panel: one for the data whose level is A and another for the data whose level is B. For instance, when plotting pairs(df[2:4]), I'd like to see these two coefficients in the first box of the second row (lower panel) of my matrix.
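To make the target concrete, the two numbers expected in each box can be computed outside of pairs. This is only a minimal sketch; the name cor.by.level is made up for illustration.

```r
# Same fake data as above, repeated so the snippet runs on its own
set.seed(123)
fakeData <- rnorm(10)
df <- data.frame(Type = c(rep("A", 5), rep("B", 5)),
                 Syst = fakeData, Bio = 2 * fakeData, Blr = fakeData^2)

# One Pearson correlation matrix per level of Type
# (cor.by.level is a hypothetical name)
cor.by.level <- lapply(split(df[2:4], df$Type), cor)

# The two coefficients wanted in the first box of the second row (Bio vs. Syst)
cor.by.level$A["Bio", "Syst"]
cor.by.level$B["Bio", "Syst"]
```

Each lower-panel box would then show the A and B entries for the corresponding pair of variables.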
This line of code
pairs(df[2:4], main="", pch=21, bg=c("red","blue")[factor(df$Type)], lower.panel=NULL)
will plot the scatterplot matrix on the upper panel. By assigning color options to bg, I can differentiate between A and B data points. Ideally, each Pearson correlation coefficient will be plotted in the same color as its respective data points.
Attempt #1 - I took the commented-out function below and changed it a bit to produce the desired result.
# panel.cor <- function(x, y, digits=2, prefix="", cex.cor, ...)
# {
# usr <- par("usr"); on.exit(par(usr))
# par(usr = c(0, 1, 0, 1))
# r <- abs(cor(x, y))
# txt <- format(c(r, 0.123456789), digits=digits)[1]
# txt <- paste(prefix, txt, sep="")
# if(missing(cex.cor)) cex.cor <- 2
# text(0.5, 0.5, txt, cex = cex.cor)
# }
I know my data frame "df" has 10 rows. Suppose I want to print the correlation of only the data whose level is A in the lower panel. I thought of subsetting x and y so that both variables contain only level-A data.
panel.cor <- function(x, y, digits=2, prefix="", cex.cor, ...)
{
  x <- x[1:5,1:3]
  y <- y[1:5,1:3]
  usr <- par("usr"); on.exit(par(usr))
  par(usr = c(0, 1, 0, 1))
  r <- abs(cor(x, y))
  txt <- format(c(r, 0.123456789), digits=digits)[1]
  txt <- paste(prefix, txt, sep="")
  if(missing(cex.cor)) cex.cor <- 2
  text(0.5, 0.5, txt, cex = cex.cor)
}
Unfortunately, this didn't work. I get an error message that says "incorrect number of dimensions".
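The error most likely arises because pairs hands each panel function two plain numeric vectors (one column each), so two-index subsetting such as x[1:5,1:3] has no second dimension to index. Below is a minimal sketch of one possible direction, not a definitive solution. It assumes a function factory so that the grouping and colors do not have to be passed through pairs, and the names group.cor and make.panel.cor are hypothetical. Unlike the function above, it does not take the absolute value of the coefficient.

```r
# Sketch only: group.cor and make.panel.cor are hypothetical helper names.
set.seed(123)
fakeData <- rnorm(10)
df <- data.frame(Type = c(rep("A", 5), rep("B", 5)),
                 Syst = fakeData, Bio = 2 * fakeData, Blr = fakeData^2)

# Pearson correlation of x and y within each level of `groups`
group.cor <- function(x, y, groups) {
  groups <- factor(groups)
  sapply(levels(groups), function(g) cor(x[groups == g], y[groups == g]))
}

# Build a panel function that remembers the grouping and the colors,
# so nothing extra has to be passed through pairs()
make.panel.cor <- function(groups, cols, digits = 2, cex.cor = 1.5) {
  function(x, y, ...) {
    usr <- par("usr"); on.exit(par(usr = usr))
    par(usr = c(0, 1, 0, 1))
    # x and y are vectors here, so they are subset with a single index
    r <- group.cor(x, y, groups)
    txt <- paste0(names(r), ": ", formatC(r, digits = digits, format = "f"))
    # stack one label per level from top to bottom, each in its level's color
    ypos <- rev(seq_along(r)) / (length(r) + 1)
    text(0.5, ypos, txt, col = cols[seq_along(r)], cex = cex.cor)
  }
}

cols <- c("red", "blue")
pairs(df[2:4], main = "", pch = 21, bg = cols[factor(df$Type)],
      lower.panel = make.panel.cor(df$Type, cols))
```

Because the grouping vector lives in the closure, the upper-panel points never see arguments they do not understand, which avoids warnings about unknown graphical parameters.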