I am working on a very large dataset and have laid out a simplified version below:
group <- c(rep("A", 3), rep("B", 3), rep("C", 3))
X <- c(0, 1, 2, 0, 1, 2, 0, 1, 2)
Y <- c(0, 2, 4, 0, 3, 6, 0, 4, 8)
df <- data.frame(group, X, Y)
I am trying to use linear regression to obtain the coefficients of three lines, one for each of groups A, B, and C (group is a factor variable), but have had little luck with the code below.
I came across some R code that suggested using a ' * ' sign between the independent variable and the factor variable in the model formula to (in the case of this example) calculate the slopes of lines A, B, and C.
lin.reg <- lm(Y ~ X*group, data = df)
coefficients_for_ABC <- summary(lin.reg)
I suspect the code I came across is incorrect and that I need to apply by() or a similar function to fit each group separately.
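To make the comparison concrete, here is a minimal sketch that puts the two approaches side by side on the example data. It assumes the goal is one intercept and one slope per group, and that A is the reference level (R's default, since it sorts first alphabetically). The names fit, b, slopes, and per_group are illustrative only. In the ' * ' model the X coefficient is the slope for group A, and each X:group term is the difference in slope from A, so the per-group slopes have to be added up from the coefficients.

```r
group <- c(rep("A", 3), rep("B", 3), rep("C", 3))
X <- c(0, 1, 2, 0, 1, 2, 0, 1, 2)
Y <- c(0, 2, 4, 0, 3, 6, 0, 4, 8)
df <- data.frame(group, X, Y)

# Approach 1: a single model with an X-by-group interaction.
# Coefficients are expressed relative to the reference group A.
fit <- lm(Y ~ X * group, data = df)
b <- coef(fit)

# Rebuild each group's slope from the reference slope plus its offset.
slopes <- c(
  A = b[["X"]],
  B = b[["X"]] + b[["X:groupB"]],
  C = b[["X"]] + b[["X:groupC"]]
)
print(slopes)  # approximately A = 2, B = 3, C = 4 for this data

# Approach 2: a separate regression for each group.
# split() gives one data frame per group; sapply() collects the coefficients
# into a matrix with rows "(Intercept)" and "X" and one column per group.
per_group <- sapply(split(df, df$group), function(d) coef(lm(Y ~ X, data = d)))
print(per_group["X", ])  # the same slopes, fitted group by group
```

On this example both approaches give the same slopes. Note that the example data are exactly linear, so summary() on either fit may warn about an essentially perfect fit; that is a property of the toy data rather than of the formula.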