Is there a way of creating a loop that will create a new variable for each of the original 18 variables?

Question

I have a data set with 4 variables. One of them is a dummy (exits) indicating whether the individual graduated from a particular program. For each of the other 3 variables, I need a loop that creates two new variables: the mean where the dummy = 1 and the mean where the dummy = 0. This is my code. I want to make it more efficient, since afterwards I want to create a new data.frame for exits == 0 and subtract the two.

 summary_means_1 = bf %>%
 filter(exits == 1) %>% 
 summarise(
 v1_1 = as.double(mean(bf$v25_grad, na.rm = TRUE)),
 v2_1 = as.double(mean(bf$v29_read, na.rm = TRUE)),
 v3_1 = as.double(mean(bf$v30_math, na.rm = TRUE))
 )

This will be easier to answer with some [example data and expected output](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). — neilfws, Feb 10 '19 at 22:36
This is unclear. Please clarify and give a reproducible example which illustrates the core problem. — John Coleman, Feb 10 '19 at 22:36
https://stackoverflow.com/questions/11952706/generate-a-dummy-variable might have what you need — JustGettinStarted, Feb 10 '19 at 22:36
Using your new code you don't need the df$ in the summarise, and you can `group_by` instead of filtering which will give you the means for both 0 and 1 at the same time. See my answer for how it will look — morgan121, Feb 10 '19 at 23:22

morgan121 · Answer 1 · 2019-02-10T22:45:50.250

You can do this with the dplyr package:

Say this is your data (simplified):

df <- data.frame(Dummy = sample(0:1, 10, replace = TRUE), V1 = rnorm(10, 10), V2 = rpois(10, 0.5))

This code will calculate the mean of each column, split by dummy:

library(dplyr)  # provides group_by(), summarise() and the %>% pipe
df %>%
  group_by(Dummy) %>%
  summarise(Mean_V1 = mean(V1, na.rm = TRUE),
            Mean_V2 = mean(V2, na.rm = TRUE))

You'll need to add a new line inside summarise() for each column.
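If writing one line per column gets tedious (for example with 18 variables), one possible alternative is `across()`, which applies the same function to many columns at once. This is only a sketch: it assumes dplyr 1.0.0 or later, and the example data and the `Mean_` prefix are illustrative choices rather than anything from the question.

```r
library(dplyr)

# Hypothetical example data; Dummy is fixed so both groups are non-empty
set.seed(1)
df <- data.frame(Dummy = rep(0:1, each = 5),
                 V1 = rnorm(10, 10),
                 V2 = rpois(10, 0.5))

# One row per value of Dummy, one Mean_* column per remaining variable.
# Inside a grouped summarise(), everything() skips the grouping column.
group_means <- df %>%
  group_by(Dummy) %>%
  summarise(across(everything(),
                   ~ mean(.x, na.rm = TRUE),
                   .names = "Mean_{.col}"))

print(group_means)
```

The result has the columns `Dummy`, `Mean_V1` and `Mean_V2`, and it grows automatically as more columns are added to the data.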

Using base R you can use colMeans with subsetted data:

colMeans(df[df$Dummy==0, -1])
colMeans(df[df$Dummy==1, -1])

Or you could combine them like this:

data.frame(Col=c("V1", "V2"), 
           Mean_0=colMeans(df[df$Dummy==0, -1]), 
           Mean_1=colMeans(df[df$Dummy==1, -1]))
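Since the question also mentions subtracting the two sets of means, the combined table could carry a difference column as well. The following is a minimal base R sketch under assumed example data; the names `means_0`, `means_1`, `comparison` and `Diff` are made up for illustration.

```r
# Hypothetical example data; Dummy is fixed so both groups are non-empty
set.seed(1)
df <- data.frame(Dummy = rep(0:1, each = 5),
                 V1 = rnorm(10, 10),
                 V2 = rpois(10, 0.5))

# Column means for each group, dropping the Dummy column itself
means_0 <- colMeans(df[df$Dummy == 0, -1], na.rm = TRUE)
means_1 <- colMeans(df[df$Dummy == 1, -1], na.rm = TRUE)

# One row per variable: both group means and their difference
comparison <- data.frame(Col = names(means_0),
                         Mean_0 = means_0,
                         Mean_1 = means_1,
                         Diff = means_1 - means_0,
                         row.names = NULL)

print(comparison)
```

Because `colMeans` returns a named vector, the subtraction lines up variable by variable without any explicit loop.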
