In the Chapter 3 video lab for An Introduction to Statistical Learning, presented by Trevor Hastie, there are functions that fit a regression on one predictor variable and plot the result. The relevant section starts at 19:22.
https://www.youtube.com/watch?v=gNZfqHhq_B4&list=PLoROMvodv4rOzrYsAxzQyHb8n_RWNuS1e&index=14
I would like to extend and modify the code so that the function takes a data.frame as input and still works when the number of predictors is not known in advance. I do not want to use attach().
Below is a function from the presentation.
Then I create two data frames.
I also created three different interactive scenarios that I would like the function to be able to handle. I am looking for help on writing the function.
# Function By Trevor Hastie
library(ISLR2)
# version 1
regplot = function(x, y) {
  fit = lm(y ~ x)
  plot(x, y)
  abline(fit, col = "red")
}
attach(Carseats)
regplot(Price, Sales)
# My work
numRows = 100
df1 = data.frame(y = rnorm(numRows), x1 = rnorm(numRows))
df2 = data.frame(y = rnorm(numRows), x1 = rnorm(numRows),
x2 = rnorm(numRows), x3 = rnorm(numRows) )
# Case 1
lm.fit1 = lm(y ~ x1, data = df1)
plot(lm.fit1$fitted.values, lm.fit1$residuals)
# Possible function call: regplot(df1, y, x1)
# Case 2
lm.fit2 = lm(y ~ x1 + x2, data = df2)
plot(lm.fit2$fitted.values, lm.fit2$residuals)
# Possible function call: regplot(df2, y, c(x1, x2))
# Case 3
lm.fit3 = lm(y ~ x1 + x2 + x3 + x2:x3, data = df2)
plot(lm.fit3$fitted.values, lm.fit3$residuals)
# Possible function call: regplot(df2, y, c(x1, x2, x3), inter = c(x2, x3))
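To make the target concrete, here is a minimal sketch of one shape such a function might take. It is not from the video. The name regplot_df and its arguments are hypothetical, and it assumes the column names are passed as character strings rather than as the bare names used in the possible calls above, so that the formula can be built with reformulate() and no attach() is needed. It also assumes a single interaction term, given as a vector of column names.

```r
# Hypothetical helper (not from the ISLR lab): fits lm() on a data frame
# with any number of predictors and plots residuals against fitted values.
regplot_df <- function(data, response, predictors, inter = NULL) {
  terms <- predictors
  if (!is.null(inter)) {
    # Join the interacting columns with ":" to get a term such as "x2:x3"
    terms <- c(terms, paste(inter, collapse = ":"))
  }
  # reformulate() builds response ~ term1 + term2 + ... from character vectors
  form <- reformulate(terms, response = response)
  fit <- lm(form, data = data)
  plot(fitted(fit), resid(fit), xlab = "Fitted values", ylab = "Residuals")
  abline(h = 0, col = "red")
  invisible(fit)  # return the fitted model so it can be inspected later
}

set.seed(1)
numRows <- 100
df2 <- data.frame(y = rnorm(numRows), x1 = rnorm(numRows),
                  x2 = rnorm(numRows), x3 = rnorm(numRows))

# Case 3: three predictors plus the x2:x3 interaction
fit3 <- regplot_df(df2, "y", c("x1", "x2", "x3"), inter = c("x2", "x3"))
```

Cases 1 and 2 would then be regplot_df(df1, "y", "x1") and regplot_df(df2, "y", c("x1", "x2")). Accepting bare column names instead of strings would need non-standard evaluation, which this sketch deliberately avoids.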