
I have a table:

CityData ->

City        Price     Bathrooms      Bedrooms      Porch

Milwaukee   2300      2              3             yes
Chicago     3400      3              2             yes
Springfield 2300      1              1             no
Chicago     2390      2              1             yes

I would like to run a regression for each city (multiple rows per city) to give me coefficients for each city. I want to regress price on the other confounding variables (bathrooms, bedrooms, porch).

I tried the dplyr library:

library(dplyr)

fitted_models = CityData %>% 
    group_by(CityData$City) %>% 
    do(model = lm(CityData$Price ~ CityData$Bathrooms +
                  CityData$Porch + CityData$Bedrooms, data = CityData))

But the output is just

14    lm    list
14    lm    list
14    lm    list

Any suggestions?

smci
DiamondJoe12
  • Do you want different intercepts for each city, or a different set of coefficients for bedrooms, bathrooms, etc. for each city? Also, to make your example reproducible, you're going to have to supply enough data to run a regression (more rows than predictors). – alistaire Jun 26 '18 at 01:49
  • I want a different coefficient for the Porch variable. – DiamondJoe12 Jun 26 '18 at 01:55
  • Start with `lm(Price ~ City + Bathrooms + Porch + Bedrooms, CityData)`, and get more complicated as necessary from there. Also, don't use `$` subsetting in formulas or dplyr/tidy eval functions—just use the bare variable name. – alistaire Jun 26 '18 at 02:18
  • I don't understand what this output is. What's 14? Are `lm` and `list` column types? If so, those are the types to be expected from that code, so you need to explain more clearly what you're getting and what you're looking for. – camille Jun 26 '18 at 02:43

1 Answer


You might try something like this. Here I'll use the mtcars data as an example.

library(dplyr)
df <- mtcars
# Fit one model per cyl group and keep each fit in a list column
models <- df %>% group_by(cyl) %>% summarise(mod = list(lm(mpg ~ wt)))

This gives you a new list column, mod, that holds one fitted model per group. You can extract the coefficients of a single model like this:

models$mod[[1]]$coefficients
(Intercept)          wt 
39.571196     -5.647025
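To see the coefficients for every group at once rather than one model at a time, something like the following sketch may help. It refits the same per-cylinder models and uses coef(), the standard accessor for fitted lm objects; the name coef_by_cyl is only an illustrative choice and is not part of the original answer.

```r
library(dplyr)

# Fit mpg ~ wt separately for each cylinder count, as in the answer.
models <- mtcars %>%
  group_by(cyl) %>%
  summarise(mod = list(lm(mpg ~ wt)))

# coef() pulls the coefficient vector out of each fit;
# sapply() binds them into a matrix with one column per group.
coef_by_cyl <- sapply(models$mod, coef)   # hypothetical name
colnames(coef_by_cyl) <- models$cyl
coef_by_cyl
```

Each column is one cyl group and each row is one term of the model, so the slopes can be compared across groups directly.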

You can get more complex with it too.

models <- df %>% group_by(cyl) %>% summarise(mod = list(lm(mpg ~ wt + hp)))
models$mod[[1]]$coefficients
(Intercept)          wt          hp 
45.83607319 -5.11506233 -0.09052672 

Of course, models also still holds the grouping variable:

models$cyl
[1] 4 6 8
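Applied to the question's table, a minimal sketch might look like this. The rows below are made up for illustration, since the question shows too few rows per city to fit three predictors, and city_coefs is a hypothetical name. The formula uses bare column names and data = . so that each model sees only the current group's rows. The original attempt used CityData$ inside the formula, which makes every model fit on the full table.

```r
library(dplyr)

# Made-up rows for illustration only; this is not the asker's data.
CityData <- data.frame(
  City      = rep(c("Chicago", "Milwaukee"), each = 6),
  Price     = c(2100, 2900, 2800, 3400, 2500, 3300,
                1500, 2000, 2200, 2600, 1900, 2500),
  Bathrooms = rep(c(1, 2, 2, 3, 1, 3), times = 2),
  Bedrooms  = rep(c(1, 1, 3, 2, 2, 3), times = 2),
  Porch     = rep(c("no", "yes", "no", "yes", "yes", "no"), times = 2)
)

# Bare column names in the formula; `.` is the data for the current group.
fitted_models <- CityData %>%
  group_by(City) %>%
  do(model = lm(Price ~ Bathrooms + Bedrooms + Porch, data = .))

# One row of coefficients per city, one column per model term.
city_coefs <- t(sapply(fitted_models$model, coef))   # hypothetical name
rownames(city_coefs) <- fitted_models$City
city_coefs
```

The Porchyes column then holds a separate porch coefficient for each city, which is what the comments under the question ask for. The group_by() plus summarise(list(lm(...))) pattern shown above should work the same way with these column names.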
AndS.