Getting mean score for each group from linear regression output

Question

I ran a linear regression predicting life satisfaction from sex, race, and their interaction.

lm2 <- lm(nids$satisfaction ~ nids$male + nids$race + nids$male:nids$race)

Here is the output:

Call:
lm(formula = nids$satisfaction ~ nids$male + nids$race + nids$male:nids$race)

Residuals:
    Min      1Q  Median      3Q     Max 
-6.6613 -1.3366 -0.0485  1.7378  4.9515 

Coefficients:
                    Estimate Std. Error t value Pr(>|t|)    
(Intercept)          4.17751    0.05467  76.410  < 2e-16 ***
nids$male            0.39318    0.08564   4.591 4.45e-06 ***
nids$race            0.87095    0.03421  25.459  < 2e-16 ***
nids$male:nids$race -0.17947    0.05261  -3.411 0.000649 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 2.358 on 12016 degrees of freedom
Multiple R-squared:  0.07414,   Adjusted R-squared:  0.07391 
F-statistic: 320.7 on 3 and 12016 DF,  p-value: < 2.2e-16

I'm required to provide the mean life satisfaction score for (1) each sex group and (2) each race group (4 in total).

So, how can I do this in R? I know that I can just aggregate the data, but there is a hint that I can use some of the coefficients to work out the mean satisfaction level for both the sex and race groups.

Thank you very much in advance.
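
To make the hint concrete: the printed coefficients define a fitted equation, satisfaction = intercept + b_male * male + b_race * race + b_interaction * male * race, so a fitted value for any sex and race combination is just a sum of coefficients. The sketch below plugs in the numbers from the output. It assumes `male` is coded 0/1 and `race` is a numeric code running 1 to 4; neither coding is stated in the post, and the helper name `fitted_satisfaction` is made up for illustration. Because race enters as a single numeric slope, these are fitted values on a straight line in race, not necessarily the exact observed group means.

```r
# Coefficients copied from the summary output above
b0      <- 4.17751   # (Intercept)
b_male  <- 0.39318   # nids$male
b_race  <- 0.87095   # nids$race
b_inter <- -0.17947  # nids$male:nids$race

# Fitted value for a given sex and race code (hypothetical helper)
fitted_satisfaction <- function(male, race) {
  b0 + b_male * male + b_race * race + b_inter * male * race
}

# ASSUMPTION: male is coded 0/1 and race is coded 1, 2, 3, 4
grid <- expand.grid(male = 0:1, race = 1:4)
grid$fitted <- fitted_satisfaction(grid$male, grid$race)
print(grid)
```

If race were instead coded 0 to 3, the same function applies with `race = 0:3` in the grid.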

Please provide a reproducible example. There are numerous data sets implemented in R to do so. — lukeA, Dec 12 '15 at 16:14
Welcome to StackOverflow. A [reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) would be helpful — polka, Dec 12 '15 at 16:17
The easiest way would be to abstract from your data and use a built-in data set: `help(pack="datasets")`. — lukeA, Dec 12 '15 at 16:24
Updated my post with the links to data as well as to .r file. — Ruslan Seletskiy, Dec 12 '15 at 16:25

score 0 · Accepted Answer · answered Dec 12 '15 at 16:36

One quick way of doing it:

malenids <- nids[nids$male == 1, ]
femalenids <- nids[nids$male == 0, ]
lapply(split(malenids, malenids$race), function(x) mean(x$satisfaction))
lapply(split(femalenids, femalenids$race), function(x) mean(x$satisfaction))
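
The same in-sample means can probably be obtained more compactly with `tapply`, which also gives the per-sex and per-race means the question asks for. This is only a sketch: `nids_toy` is a hypothetical stand-in for `nids`, with values invented for illustration and only two race codes to keep it short.

```r
# Hypothetical stand-in for the asker's `nids` data frame
# (values invented purely for illustration)
nids_toy <- data.frame(
  male         = c(0, 0, 1, 1, 0, 1, 0, 1),
  race         = c(1, 2, 1, 2, 1, 2, 2, 1),
  satisfaction = c(4, 6, 5, 7, 6, 5, 8, 3)
)

# Mean satisfaction for every sex-by-race cell in one call
cell_means <- with(nids_toy,
                   tapply(satisfaction, list(male = male, race = race), mean))

# Marginal means: one per sex group, one per race group
sex_means  <- with(nids_toy, tapply(satisfaction, male, mean))
race_means <- with(nids_toy, tapply(satisfaction, race, mean))

print(cell_means)
print(sex_means)
print(race_means)
```

With the real data, replacing `nids_toy` by `nids` should be the only change needed, provided the column names match.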

Raad


Yes, but the question is how to do it using only this linear regression output :( Is there any principle behind it? – Ruslan Seletskiy Dec 12 '15 at 16:46
OK, I misinterpreted. What is the reason behind this? Why don't you just generate your predictions and then aggregate, or compute it in-sample as above? – Raad Dec 12 '15 at 16:52
But what do you mean by 'generate your predictions'? – Ruslan Seletskiy Dec 12 '15 at 16:55
Using `predict(lm2)` or `fitted(lm2)` as you haven't supplied any new data. – Raad Dec 12 '15 at 16:57
Now I really don't understand what you mean by that. If I have only the linear regression output, which of its coefficients can I use to calculate the mean for each race and sex group? Again, given only the linear regression output shown above. – Ruslan Seletskiy Dec 12 '15 at 17:02
@dmg Check out `fit <- lm(weight ~ group, PlantGrowth); c(coefficients(fit)[1], coefficients(fit)[1] + coefficients(fit)[-1]); aggregate(weight ~ group, PlantGrowth, mean)` or `fit <- lm(mpg ~ as.factor(cyl), mtcars); c(coefficients(fit)[1], coefficients(fit)[1] + coefficients(fit)[-1]); aggregate(mpg ~ as.factor(cyl), mtcars, mean)`. – lukeA Dec 12 '15 at 17:06
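
The principle behind the one-liners in the last comment extends to two predictors with an interaction, as long as both are factors: the model then has one parameter per cell, and under R's default treatment contrasts each cell mean is the intercept plus the coefficients that switch on for that cell. A minimal sketch on the built-in `mtcars` data, where `am` and `vs` are stand-ins for sex and race (this is not the asker's `nids` data):

```r
# Both predictors are converted to factors so that the model with
# interaction is saturated (one parameter per am-by-vs cell).
d   <- transform(mtcars, am = factor(am), vs = factor(vs))
fit <- lm(mpg ~ am * vs, data = d)
b   <- coef(fit)

# Cell means rebuilt from coefficients (treatment contrasts, R's default)
from_coefs <- c(
  "am0.vs0" = unname(b["(Intercept)"]),
  "am1.vs0" = unname(b["(Intercept)"] + b["am1"]),
  "am0.vs1" = unname(b["(Intercept)"] + b["vs1"]),
  "am1.vs1" = unname(b["(Intercept)"] + b["am1"] + b["vs1"] + b["am1:vs1"])
)

# Raw cell means for comparison; rows come out with am varying fastest,
# which matches the order used in from_coefs
raw <- aggregate(mpg ~ am + vs, data = d, FUN = mean)

print(from_coefs)
print(raw)
```

In the question's output `race` has a single coefficient, which suggests it was fitted as a numeric variable; in that case the sums of coefficients give fitted values rather than exact cell means, and wrapping it in `factor()` before fitting would be needed for this identity to hold.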
