Hi, I would like some help interpreting the results of a regression analysis I am conducting.
Here's the reproducible example:
# A tibble: 50 x 11
Country Prefix Year GTD HDI `Population Siz~ `Military Spend~
<chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 Brunei 1 1996 0 0.807 0.305 6.25
2 Brunei 1 1997 0 0.81 0.312 7.18
3 Brunei 1 1998 0 0.812 0.319 7.53
4 Brunei 1 1999 0 0.818 0.326 6.13
5 Brunei 1 2000 0 0.819 0.333 4.07
6 Brunei 1 2001 0 0.82 0.340 3.89
7 Brunei 1 2002 0 0.823 0.347 3.87
8 Brunei 1 2003 0 0.828 0.353 3.71
9 Brunei 1 2004 0 0.834 0.359 2.53
10 Brunei 1 2005 0 0.838 0.365 2.35
# ... with 40 more rows, and 4 more variables: `Voice and Accountability
# (-2.5 to 2.5)` <dbl>, `Education Index (0 to 1)` <dbl>, `Youth
# Unemployment (%)` <dbl>, `GDP Per Capita (In US$)` <dbl>
The full dataset contains more countries than shown here. GTD measures the number of terrorist incidents per year.
Code for Simple Linear Model
PovertyonTerrorism = PI_Final_Dataset_PS3257_
# The model call below refers to the data as POT; assumed here to be the same data frame
POT = PovertyonTerrorism
PovertyonTerrorism.lm = lm(GTD ~ HDI +
`Population Size (In Millions)` +
`Military Spending (% of GDP)` +
`Voice and Accountability (-2.5 to 2.5)` +
`Youth Unemployment (%)`, data = POT)
summary(PovertyonTerrorism.lm)
Call:
lm(formula = GTD ~ HDI + `Population Size (In Millions)` + `Military Spending (% of GDP)` +
`Voice and Accountability (-2.5 to 2.5)` + `Youth Unemployment (%)`,
data = POT)
Residuals:
Min 1Q Median 3Q Max
-120.61 -47.41 -28.11 22.41 608.11
Coefficients:
Estimate Std. Error t value
(Intercept) -55.2402 46.3994 -1.191
HDI 276.9661 75.1568 3.685
`Population Size (In Millions)` 0.4785 0.1164 4.113
`Military Spending (% of GDP)` -13.8050 6.0859 -2.268
`Voice and Accountability (-2.5 to 2.5)` 29.5829 11.4481 2.584
`Youth Unemployment (%)` -7.1644 1.4098 -5.082
Pr(>|t|)
(Intercept) 0.235032
HDI 0.000284 ***
`Population Size (In Millions)` 5.4e-05 ***
`Military Spending (% of GDP)` 0.024212 *
`Voice and Accountability (-2.5 to 2.5)` 0.010366 *
`Youth Unemployment (%)` 7.6e-07 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 105.4 on 236 degrees of freedom
Multiple R-squared: 0.1818, Adjusted R-squared: 0.1644
F-statistic: 10.49 on 5 and 236 DF, p-value: 4.202e-09
Code for PCSE Correction
library(pcse)
PovertyonTerrorism.pcse = pcse(PovertyonTerrorism.lm, groupN = PovertyonTerrorism$Country, groupT = PovertyonTerrorism$Year)
summary(PovertyonTerrorism.pcse)
Results:
Estimate PCSE t value
(Intercept) -55.2401838 31.5740507 -1.749544
HDI 276.9661000 49.1693320 5.632903
`Population Size (In Millions)` 0.4785356 0.0459366 10.417305
`Military Spending (% of GDP)` -13.8049962 2.6827129 -5.145909
`Voice and Accountability (-2.5 to 2.5)` 29.5828518 13.5261561 2.187085
`Youth Unemployment (%)` -7.1644384 1.1171568 -6.413100
Pr(>|t|)
(Intercept) 8.149689e-02
HDI 5.022592e-08
`Population Size (In Millions)` 3.752064e-21
`Military Spending (% of GDP)` 5.601758e-07
`Voice and Accountability (-2.5 to 2.5)` 2.971846e-02
`Youth Unemployment (%)` 7.704480e-10
---------------------------------------------
# Valid Obs = 242; # Missing Obs = 0; Degrees of Freedom = 236.
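To make the two outputs easier to compare, here is a minimal sketch that puts the OLS standard errors and the PCSEs side by side. The numbers are typed in by hand from the two summaries above (rounded as printed), and the `comparison` data frame and its column names are just illustrative names, not something produced by `lm()` or `pcse()`.

```r
# Values copied by hand from the lm() and pcse() summaries above.
# `comparison` and its column names are hypothetical, for illustration only.
comparison <- data.frame(
  term     = c("(Intercept)", "HDI", "Population Size (In Millions)",
               "Military Spending (% of GDP)",
               "Voice and Accountability (-2.5 to 2.5)",
               "Youth Unemployment (%)"),
  estimate = c(-55.2402, 276.9661, 0.4785, -13.8050, 29.5829, -7.1644),
  se_ols   = c(46.3994, 75.1568, 0.1164, 6.0859, 11.4481, 1.4098),
  se_pcse  = c(31.5740507, 49.1693320, 0.0459366, 2.6827129,
               13.5261561, 1.1171568)
)

# A ratio below 1 means the PCSE is smaller than the OLS standard error.
comparison$se_ratio <- comparison$se_pcse / comparison$se_ols

# Recompute the t values as estimate / standard error, as a check
# against the t values printed in the two summaries.
comparison$t_ols  <- comparison$estimate / comparison$se_ols
comparison$t_pcse <- comparison$estimate / comparison$se_pcse

print(comparison, digits = 4)
```

The coefficient estimates are identical in both outputs; only the standard errors, and therefore the t values and p-values, differ.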
Does this mean that, although several variables were statistically significant in the linear model, none of them are statistically significant once the PCSE correction is applied? I would appreciate any help clarifying the interpretation of the PCSE results :)