Is there a way to write a shorthand formula that includes all but a few variables?
For example, instead of
modreg_trein <- lm(Life.expectancy ~ Status + Adult.Mortality + infant.deaths + percentage.expenditure + Hepatitis.B + Measles + BMI + under.five.deaths + Polio + Diphtheria + HIV.AIDS + GDP + Population + thinness..1.19.years + thinness.5.9.years + Income.composition.of.resources + Schooling, data = life_2015_clean)
I would like to write something like
modreg_trein <- lm(Life.expectancy ~ . - Alcohol - Total.expenditure, data = life_2015_clean)
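A formula of this kind can also be assembled programmatically. The following is a minimal sketch on a made-up data frame (`toy`, `y`, `a`, `b`, `c` are hypothetical names, not columns of the life-expectancy data), using base R's `setdiff()` and `reformulate()` to list every predictor except the excluded ones:

```r
# Toy data; all names here are hypothetical and only illustrate the idea.
toy <- data.frame(
  y = c(1.0, 2.1, 2.9, 4.2, 5.1, 5.8),
  a = c(1, 2, 3, 4, 5, 6),
  b = c(2, 1, 4, 3, 6, 5),
  c = c(9, 7, 8, 6, 5, 4)
)

# Keep every column except the response and the predictors to exclude.
excluded   <- c("c")
predictors <- setdiff(names(toy), c("y", excluded))

# reformulate() turns the character vector into the formula y ~ a + b.
f   <- reformulate(predictors, response = "y")
fit <- lm(f, data = toy)
print(f)
```

Unlike the `. - x` form, a formula built this way never mentions the excluded columns at all.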
EDIT: MWE
Data available in: https://www.kaggle.com/augustus0498/life-expectancy-who?select=led.csv
Steps to reproduce:
life <- read.csv('./data/csv/Life_Expectancy_Data.csv')
life_2015 <- subset(life, Year=="2015")
life_2015_clean <- subset(life_2015, select=-c(Country, Year))
life_2015_clean$Status <- as.numeric(as.factor(life_2015_clean$Status))
Finally, manually listing every variable except Alcohol and Total.expenditure gives a successful regression:
modreg_trein <- lm(Life.expectancy ~ Status + Adult.Mortality + infant.deaths + percentage.expenditure + Hepatitis.B + Measles + BMI + under.five.deaths + Polio + Diphtheria + HIV.AIDS + GDP + Population + thinness..1.19.years + thinness.5.9.years + Income.composition.of.resources + Schooling , life_2015_clean)
summary(modreg_trein)
Call:
lm(formula = Life.expectancy ~ Status + Adult.Mortality + infant.deaths +
percentage.expenditure + Hepatitis.B + Measles + BMI + under.five.deaths +
Polio + Diphtheria + HIV.AIDS + GDP + Population + thinness..1.19.years +
thinness.5.9.years + Income.composition.of.resources + Schooling,
data = life_2015_clean)
Residuals:
    Min      1Q  Median      3Q     Max
-7.3326 -1.4047  0.0247  1.5478  7.9440
Coefficients:
                                  Estimate Std. Error t value Pr(>|t|)
(Intercept)                      5.083e+01  3.024e+00  16.807  < 2e-16 ***
Status                          -3.824e-01  8.332e-01  -0.459   0.6472
Adult.Mortality                 -2.077e-02  3.619e-03  -5.738 8.29e-08 ***
infant.deaths                    6.626e-02  3.291e-02   2.013   0.0465 *
percentage.expenditure           5.575e-03  7.362e-03   0.757   0.4505
Hepatitis.B                      4.311e-02  2.264e-02   1.904   0.0595 .
Measles                         -5.027e-05  5.741e-05  -0.876   0.3832
BMI                             -9.085e-03  1.554e-02  -0.585   0.5600
under.five.deaths               -4.811e-02  2.359e-02  -2.040   0.0437 *
Polio                            1.179e-02  1.271e-02   0.928   0.3553
Diphtheria                      -1.148e-02  2.636e-02  -0.435   0.6641
HIV.AIDS                        -4.858e-01  2.243e-01  -2.166   0.0324 *
GDP                              5.950e-06  3.011e-05   0.198   0.8437
Population                      -7.918e-10  9.586e-09  -0.083   0.9343
thinness..1.19.years            -1.192e-01  2.343e-01  -0.509   0.6119
thinness.5.9.years              -2.030e-02  2.291e-01  -0.089   0.9296
Income.composition.of.resources  3.331e+01  4.991e+00   6.674 9.93e-10 ***
Schooling                       -5.244e-02  2.407e-01  -0.218   0.8279
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 2.7 on 112 degrees of freedom
(53 observations deleted due to missingness)
Multiple R-squared: 0.901, Adjusted R-squared: 0.886
F-statistic: 59.99 on 17 and 112 DF, p-value: < 2.2e-16
But the shorthand version does not:
modreg_trein <- lm(Life.expectancy ~ . - Alcohol - Total.expenditure, life_2015_clean)
summary(modreg_trein)
Output:
Call:
lm(formula = Life.expectancy ~ . - Alcohol - Total.expenditure,
data = life_2015_clean)
Residuals:
ALL 2 residuals are 0: no residual degrees of freedom!
Coefficients: (16 not defined because of singularities)
                                Estimate Std. Error t value Pr(>|t|)
(Intercept)                     82.81164        NaN     NaN      NaN
Status                                NA         NA      NA       NA
Adult.Mortality                 -0.06772        NaN     NaN      NaN
infant.deaths                         NA         NA      NA       NA
percentage.expenditure                NA         NA      NA       NA
Hepatitis.B                           NA         NA      NA       NA
Measles                               NA         NA      NA       NA
BMI                                   NA         NA      NA       NA
under.five.deaths                     NA         NA      NA       NA
Polio                                 NA         NA      NA       NA
Diphtheria                            NA         NA      NA       NA
HIV.AIDS                              NA         NA      NA       NA
GDP                                   NA         NA      NA       NA
Population                            NA         NA      NA       NA
thinness..1.19.years                  NA         NA      NA       NA
thinness.5.9.years                    NA         NA      NA       NA
Income.composition.of.resources       NA         NA      NA       NA
Schooling                             NA         NA      NA       NA
Residual standard error: NaN on 0 degrees of freedom
(181 observations deleted due to missingness)
Multiple R-squared: 1, Adjusted R-squared: NaN
F-statistic: NaN on 1 and 0 DF, p-value: NA
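One possible explanation, offered as a hedged sketch rather than a confirmed diagnosis: terms removed with `-` are dropped from the model, but their columns still end up in the model frame, so the default `na.action` discards every row that has an `NA` in them. That would be consistent with 181 rows being deleted here against 53 in the manual version. The toy data below (hypothetical names `toy`, `y`, `a`, `b`) reproduces the effect and compares it with dropping the column before fitting:

```r
# Toy data; names are hypothetical. Column `b` is mostly missing.
toy <- data.frame(
  y = c(1.0, 2.1, 2.9, 4.2, 5.1, 5.8),
  a = c(1, 2, 3, 4, 5, 6),
  b = c(NA, NA, NA, NA, 1, 2)
)

# `b` is excluded from the model terms, yet rows with NA in `b`
# are still removed when the model frame is built.
fit_minus <- lm(y ~ . - b, data = toy)

# Dropping the column first keeps `b` out of the model frame altogether.
fit_subset <- lm(y ~ ., data = subset(toy, select = -b))

cat("rows used with '. - b':       ", nobs(fit_minus), "\n")
cat("rows used after dropping 'b': ", nobs(fit_subset), "\n")
```

If the same mechanism applies to the real data, `colSums(is.na(life_2015_clean))` should show `Alcohol` and `Total.expenditure` as almost entirely missing for 2015, and fitting `Life.expectancy ~ .` on `subset(life_2015_clean, select = -c(Alcohol, Total.expenditure))` would be one way to sidestep it.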