I'm using the following code to try to transform my response variable for regression. Seems to need a log transformation.
bc = boxCox(auto.tf.lm)
lambda.mpg = bc$x[which.max(bc$y)]
auto.tf.bc <- with(auto_mpg, data.frame(log(mpg), as.character(cylinders), displacement**.2, log(as.numeric(horsepower)), log(weight), log(acceleration), model_year))
auto.tf.bc.lm <- lm(log(mpg) ~ ., data = auto.tf.bc)
view(auto.tf.bc)
I am receiving this error though.
Error in Math.data.frame(mpg) :
non-numeric variable(s) in data frame: manufacturer, model, trans, drv, fl, class
Not sure how to resolve this. The data is in a data frame, not csv.
Here's the output from str(auto.tf.bc). Sorry for such bad question formatting.
'data.frame': 392 obs. of 7 variables:
$ log.mpg. : num 2.89 2.71 2.89 2.77 2.83 ...
$ as.character.cylinders.: chr "8" "8" "8" "8" ...
$ displacement.0.2 : num 3.14 3.23 3.17 3.14 3.13 ...
$ log.horsepower. : num 4.87 5.11 5.01 5.01 4.94 ...
$ log.weight. : num 8.16 8.21 8.14 8.14 8.15 ...
$ log.acceleration. : num 2.48 2.44 2.4 2.48 2.35 ...
$ model_year : num 70 70 70 70 70 70 70 70 70 70 ...
removing the cylinders doesn't change anything.