Here is my sample data:

df <- structure(list(Make = c("Ford", "Nissan", "Volkswagen", "Chevrolet", 
"Chevrolet", "GMC", "Mazda", "Ford", "Chevrolet", "Ford", "Cadillac", 
"Ford", "Grand Cherokee", "Volkswagen", "Chevrolet", "Toyota", 
"Toyota", "Honda", "Toyota", "Audi"), Model = c(2011L, 2011L, 
2012L, 2011L, 2010L, 2011L, 2010L, 2010L, 2012L, 2012L, 2012L, 
2010L, 2010L, 2010L, 2011L, 2010L, 2010L, 2011L, 2011L, 2011L
), Highway_mpg = c(15L, 20L, 27L, 20L, 23L, 25L, 26L, 17L, 26L, 
25L, 27L, 21L, 20L, 30L, 21L, 18L, 28L, 26L, 20L, 19L), City_mpg = c(11L, 
16L, 21L, 14L, 17L, 18L, 20L, 12L, 16L, 18L, 18L, 15L, 14L, 22L, 
15L, 14L, 19L, 17L, 15L, 12L)), row.names = c(NA, -20L), class = ("data.frame"))

Here is my desired output (assume NA's are the predicted values):

[Image: desired output table]

I want to calculate the average fuel consumption for each car model year, whilst also predicting the averages for years 2013 to 2016.

What I have tried:

I tried following the answer to this question using the following code:

cars_model <- lm(Model ~ Highway_mpg + City_mpg, data = df)

years <- data.frame(Model = c(2013:2016))

res <- predict(cars_model, years)

Error in eval(predvars, data, env) : object 'Highway_mpg' not found

After reading the error, I then tried to add the fuel consumption columns to my new df, but still got errors.

k3b

1 Answer

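Your model predicts the model year (Model) from Highway_mpg and City_mpg, which is the reverse of what you want. That is also why predict() fails: it looks for Highway_mpg and City_mpg in the new data, and years only contains Model. Since you are after the average fuel consumption per model year, as in the blue table above, make Model the predictor and fit one model for each mpg column: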

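# Fit one linear model per mpg column, with the model year as predictor
mod1 <- lm(Highway_mpg ~ Model, data = df)
mod2 <- lm(City_mpg ~ Model, data = df)

# Years to predict
years <- data.frame(Model = 2013:2016)

# Collect the predictions for both columns in one data frame
data.frame(
  Model = years$Model,
  Highway_mpg = predict(mod1, years),
  City_mpg = predict(mod2, years)
)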

  Model Highway_mpg City_mpg
1  2013    25.21429 17.14286
2  2014    26.35714 17.57143
3  2015    27.50000 18.00000
4  2016    28.64286 18.42857
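
The table above only covers the predicted years. If you also want the observed averages for 2010 to 2012 in the same table, one possible approach is to compute them with aggregate() and stack the predictions underneath. This is a minimal sketch, not the only way to do it: the names observed, predicted and result are mine, and df is rebuilt here with only the three columns that are needed so the block runs on its own.

```r
# Sample data from the question, reduced to the columns used here
df <- data.frame(
  Model = c(2011L, 2011L, 2012L, 2011L, 2010L, 2011L, 2010L, 2010L, 2012L, 2012L,
            2012L, 2010L, 2010L, 2010L, 2011L, 2010L, 2010L, 2011L, 2011L, 2011L),
  Highway_mpg = c(15L, 20L, 27L, 20L, 23L, 25L, 26L, 17L, 26L, 25L,
                  27L, 21L, 20L, 30L, 21L, 18L, 28L, 26L, 20L, 19L),
  City_mpg = c(11L, 16L, 21L, 14L, 17L, 18L, 20L, 12L, 16L, 18L,
               18L, 15L, 14L, 22L, 15L, 14L, 19L, 17L, 15L, 12L)
)

# Observed mean mpg per model year (one row per year present in the data)
observed <- aggregate(cbind(Highway_mpg, City_mpg) ~ Model, data = df, FUN = mean)

# Same two models as above: mpg as a linear function of model year
mod1 <- lm(Highway_mpg ~ Model, data = df)
mod2 <- lm(City_mpg ~ Model, data = df)

# Predictions for the years that are not in the data
years <- data.frame(Model = 2013:2016)
predicted <- data.frame(
  Model = years$Model,
  Highway_mpg = unname(predict(mod1, years)),
  City_mpg = unname(predict(mod2, years))
)

# Stack observed averages on top of the predicted ones
result <- rbind(observed, predicted)
rownames(result) <- NULL
result
```

Keep in mind that the predictions are a straight-line extrapolation from only three model years, so treat them as a rough guide.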
Park