For future posts, it is good practice to always include sample data. See here for how to provide a minimal reproducible example, including sample data.
That aside, here is a simple example based on some sample data I generate.
# Generate sample data
set.seed(2017);
x <- as.numeric(gl(2, 10, 20));
z <- 1:20;
y <- 4 * x + 0.5 * z + rnorm(20);
# Fit model
fit <- lm(y ~ as.factor(x) + z + as.factor(x) * z);
summary(fit);
#
#Call:
#lm(formula = y ~ as.factor(x) + z + as.factor(x) * z)
#
#Residuals:
# Min 1Q Median 3Q Max
#-1.9283 -0.4702 -0.1270 0.7932 1.6648
#
#Coefficients:
# Estimate Std. Error t value Pr(>|t|)
#(Intercept) 4.13695 0.79828 5.182 9.08e-05 ***
#as.factor(x)2 5.72079 2.17955 2.625 0.01839 *
#z 0.47615 0.12865 3.701 0.00194 **
#as.factor(x)2:z -0.09588 0.18195 -0.527 0.60544
#---
#Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#
#Residual standard error: 1.169 on 16 degrees of freedom
#Multiple R-squared: 0.9522, Adjusted R-squared: 0.9432
#F-statistic: 106.3 on 3 and 16 DF, p-value: 8.896e-11
# Predict for x = 1 and z = 1:5
predict(fit, newdata = data.frame(x = 1, z = 1:5));
#1 2 3 4 5
#4.613097 5.089242 5.565388 6.041533 6.517679
Note that if you want to predict the response for new values of your predictor variables, you need to supply them as a data.frame through the newdata argument, with columns named after the predictors in the model (here x and z). Otherwise, predict() returns predictions for your original data.
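To make the difference concrete, here is a minimal sketch that reuses the simulated data and model from above. The object names p_orig, new_obs, p_new and manual are only illustrative, and the values chosen for new_obs are arbitrary. It checks that predict() without newdata reproduces the fitted values, and that a prediction with newdata agrees with a by-hand calculation from the coefficients.

```r
# Same simulated data and model as above
set.seed(2017)
x <- as.numeric(gl(2, 10, 20))
z <- 1:20
y <- 4 * x + 0.5 * z + rnorm(20)
fit <- lm(y ~ as.factor(x) + z + as.factor(x) * z)

# Without newdata: one prediction per row of the original data,
# i.e. the same values as fitted(fit)
p_orig <- predict(fit)
all.equal(p_orig, fitted(fit))

# With newdata: column names must match the predictors used in the formula
# (new_obs is a made-up example with one row per group)
new_obs <- data.frame(x = c(1, 2), z = c(3, 15))
p_new <- predict(fit, newdata = new_obs)

# The same two predictions computed by hand from the coefficients
# (order: intercept, group 2 offset, slope for z, group 2 slope adjustment)
b <- unname(coef(fit))
manual <- c(b[1] + b[3] * 3,
            b[1] + b[2] + (b[3] + b[4]) * 15)
all.equal(unname(p_new), manual)
```

If a column in newdata is missing, predict() either stops with an "object not found" error or silently picks up a variable of the same name from your workspace, so it is worth checking that the length of the result matches the number of rows you supplied.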