I have 6 scatter plots created with the "pairs" function, and I want to draw the linear regression line for each one on top of its scatter plot.

I tried writing my own function and passing it as the upper.panel argument.

This is the code that produces my scatter plots on the upper diagonal. Each color represents a different class of iris flower.

pairs(iris_data_excel[1:4], lower.panel=NULL, col=c("red","blue","green")[class_to_number])

This is the function I wrote and tried to pass as the upper.panel argument:

upper_panel_regression_line = function(x,y){
  linear_regression = lm(x, y)
  linear_regression_line = abline(linear_regression)
}

This is where I pass the function as the "upper.panel" argument:

pairs(iris_data_excel[1:4], lower.panel=NULL, upper.panel=upper_panel_regression_line,
      col=c("red","blue","green")[class_to_number])

This is the error I get:

Error in lower.panel(...)
unused argument (col = c("red", "blue", "green")[class_to_number])

Here is an example that reproduces the plot using the built-in iris dataset:

#Extracts the iris species column from the iris dataset
iris_class = iris$Species

#Convert the species labels to numeric codes to be used for indexing
#1 = Iris-setosa
#2 = Iris-versicolor
#3 = Iris-virginica
class_to_number = as.numeric(factor(iris_class))

#Scatter plot matrix
pairs(iris[1:4], lower.panel=NULL, col=c("red","blue","green")[class_to_number])
  • What are `iris_data_excel` and `class_to_number`? It's easier to help you if you provide a [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input that can be used to test the code. Maybe you just need to change your function signature to `function(x,y, ...)` to ignore the extra parameters. Also, that's not how you call `lm()`. It should look more like `lm(y~x)` – MrFlick Jun 14 '19 at 18:40
  • Sorry, iris_data_excel is the data set that I am plotting and am taking from an Excel spreadsheet on my computer. However, R has the same data set built in, so I have edited my post with a reproducible example using this built in data set – Anand Patel Jun 14 '19 at 18:47

1 Answer

Here's how you can update your function:

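upper_panel_regression_line = function(x, y, ...){
  points(x, y, ...)               # draw the points, forwarding col and other graphical arguments
  linear_regression = lm(y ~ x)   # fit y on x using a formula
  abline(linear_regression)       # add the fitted line to the current panel
}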


pairs(iris[1:4], lower.panel=NULL, upper.panel=upper_panel_regression_line,
      col=c("red","blue","green")[class_to_number])

Since you are replacing the upper.panel function, you need to draw the points yourself, which is what the points() call is there for. Your function also needs to accept the col parameter, which it does via the ... argument. Finally, lm() is called with a formula, as it expects.
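
To see why the ... matters, here is a minimal sketch that reproduces the "unused argument" error outside of pairs() and then runs the same plot with a signature that accepts extra arguments. The names strict_panel and flexible_panel are made up for illustration, and the sketch assumes only base R and the built-in iris data.

```r
# Hypothetical helper names for illustration; base R only.
strict_panel <- function(x, y) {
  abline(lm(y ~ x))
}

flexible_panel <- function(x, y, ...) {
  points(x, y, ...)      # col and other graphical arguments arrive through ...
  abline(lm(y ~ x))      # regression of the panel's y variable on its x variable
}

# Passing an argument the function does not declare fails during argument
# matching, before the body runs. pairs() forwards col to the panel function,
# so a panel without ... fails in the same way.
msg <- tryCatch(
  strict_panel(1:3, 1:3, col = "red"),
  error = function(e) conditionMessage(e)
)
print(msg)

# With ... in the signature, col is accepted and handed on to points().
class_to_number <- as.numeric(iris$Species)
pairs(iris[1:4], lower.panel = NULL, upper.panel = flexible_panel,
      col = c("red", "blue", "green")[class_to_number])
```

Only points() receives the ... here, so the regression line is drawn in the default colour rather than picking up the per-point colour vector.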

MrFlick