how are you?
So, I have a dataset that looks like this:
  dirtax_trev indtax_trev lag2_majority pub_exp
        <dbl>       <dbl>         <dbl>   <dbl>
       0.1542      0.5186             0    9754
       0.1603      0.4935             0    9260
       0.1511      0.5222             1    8926
       0.2016      0.5501             0    9682
       0.6555      0.2862             1   10447
I'm having the following problem. I want to run a series of t-tests across the groups of a dummy variable (lag2_majority), collect the p-values of these tests, and store them in a vector, using a pipe.
All the variables I want to test are selected below. I then omit NA values for my grouping variable (lag2_majority) and try to summarise everything with this code:
test <- g %>%
  select(dirtax_trev, indtax_trev, gdpc_ppp, pub_exp,
         SOC_tot, balance, fdi, debt, polity2, chga_demo, b_gov, social_dem,
         iaep_ufs, gini, pov4, informal, lab, al_ethnic, al_language, al_religion,
         lag_left, lag2_left, majority, lag2_majority, left, system, b_system,
         execrlc, allhouse, numvote, legelec, exelec, pr) %>%
  na.omit(lag2_majority) %>%
  summarise_all(funs(t.test(.[lag2_majority], .[lag2_majority == 1])$p.value))
However, once I run this, I get the following error: Error in summarise_impl(.data, dots): Evaluation error: data are essentially constant.
This is confusing, since there is a clear difference in means between the two groups of the dummy variable. The same error appears when I replace the last line of the code above with: summarise_all(funs(t.test(.~lag2_majority)$p.value))
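One way to check what the two arguments passed to t.test actually contain is to try the same indexing on a small vector. The sketch below uses made-up values (x and d are hypothetical stand-ins for a data column and for lag2_majority). It only illustrates how base R treats a numeric 0/1 index versus a logical one, and what happens when the dummy is tested against itself; it may not be the only cause of the error.

```r
# Made-up stand-ins: x plays the role of a data column, d of lag2_majority.
x <- c(0.15, 0.16, 0.15, 0.20, 0.66)
d <- c(0, 0, 1, 0, 1)

# Numeric index: 0 selects nothing and 1 selects the FIRST element,
# so this returns x[1] once per 1 in d, not the d == 1 group.
by_number <- x[d]

# Logical index: keeps the elements where the condition is TRUE.
group1 <- x[d == 1]
group0 <- x[d == 0]

# summarise_all() also visits the dummy column itself. Indexing the dummy
# by itself gives a constant vector, and so does its d == 1 group.
dummy_by_number <- d[d]
dummy_group1 <- d[d == 1]

# Both samples have zero variance, so t.test() stops with an error;
# tryCatch() keeps the error message instead of aborting.
res <- tryCatch(
  t.test(dummy_by_number, dummy_group1),
  error = function(e) conditionMessage(e)
)
res
```

If this is what happens in the pipeline, dropping lag2_majority from the columns being summarised (while still using it as the grouping variable) would be the first thing to try.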
Alternatively, since all I really want is something like t.test(dirtax_trev~lag2_majority, g)$p.value for each variable, I thought I could use a loop, like this:
for (i in vars) {
  t.test(i~lag2_majority, g)$p.value
}
Here, vars is an object that contains all the variables selected in the code above. But once again I get an error message, specifically this one: Error in model.frame.default(formula = i ~ lag2_majority, data = g): variable lengths differ (found for 'lag2_majority')
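The length error is consistent with i holding a variable name as a character string rather than the column itself. That is an assumption about how vars was built, since its definition is not shown. A minimal sketch under that assumption, using a made-up data frame (toy, standing in for g), builds each formula from the name with reformulate() and stores the p-values in a named vector:

```r
# Made-up data standing in for g; the values are illustrative only.
set.seed(1)
toy <- data.frame(
  dirtax_trev   = runif(20),
  indtax_trev   = runif(20),
  pub_exp       = rnorm(20, mean = 9500, sd = 500),
  lag2_majority = rep(c(0, 1), each = 10)
)

# Assumption: vars is a character vector of column names,
# excluding the grouping dummy itself.
vars <- setdiff(names(toy), "lag2_majority")

# reformulate() turns a name into a formula such as
# dirtax_trev ~ lag2_majority, so t.test() looks the column up in the data.
p_values <- sapply(vars, function(v) {
  t.test(reformulate("lag2_majority", response = v), data = toy)$p.value
})

# Named numeric vector: one p-value per tested variable.
p_values
```

Excluding lag2_majority from vars matters here, because testing the dummy against itself would compare two constant groups.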
What am I doing wrong?
Best Regards!