I have a dataset with a population variable and a few race variables ("white", "black", "hispanic"). I want to loop through the races so that, for each one, a "percent_race" variable is created ("percent_white", etc.) and the original race variable is then dropped.
I am most familiar with Stata, where you refer to the current loop value inside the loop by wrapping the loop variable's name in a backtick and an apostrophe (`race'). This lets me name the new variables using a string from my loop, and the same string also identifies which variable to use in the formula for each new variable. Here is what I mean:
loc races white black hispanic
foreach race of local races {
    generate `race'_percentage = (population/`race')*100
    drop `race'
}
In R, I want something to the same effect:
races <- list("white", "black", "hispanic")
df %>%
for (race in races) {
mutate(percent_"race" = (population/race)*100) %>%
select(df, -c(race)) %>%
}
I threw the quotes around race when naming the variable as a filler; I know that doesn't work, but you see how I want the variables to be named.
There might be other things wrong with how I am approaching this in R. I've always done data transformation and analysis in Stata and moved to R for visualization, but I'm trying to learn to do it all in R. I'm not even sure whether using a for loop within a pipe is proper here, but it makes sense to me within this little problem I have created for myself.
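To make the goal concrete, here is a minimal sketch of one way the loop might be written, assuming dplyr 1.0 or later (for the glue-style `"percent_{race}" :=` naming). The toy `df` and its values are hypothetical, and the formula is copied from the Stata code above as written; swap the numerator and denominator if a share of the population is what is wanted. A for loop cannot sit in the middle of a pipe, so this version reassigns `df` on each pass instead.

```r
library(dplyr)

# Hypothetical toy data; column names follow the question.
df <- data.frame(
  population = c(1000, 2000),
  white      = c(500, 800),
  black      = c(250, 400),
  hispanic   = c(100, 200)
)

# A character vector is enough here; a list is not needed.
races <- c("white", "black", "hispanic")

for (race in races) {
  df <- df %>%
    # "percent_{race}" builds the new column name from the string;
    # .data[[race]] looks up the existing column named by that string.
    mutate("percent_{race}" := (population / .data[[race]]) * 100) %>%
    # Drop the original race column by name.
    select(-all_of(race))
}

names(df)
```

The loop leaves `population` plus one `percent_*` column per race. Whether a loop is the most idiomatic approach is a separate matter; `across()` is another option that may be worth looking at.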