Combining mutate(), case_when() and across() to generate a new variable based on the presence of existing data in R

Question

I have a data.frame with 266 columns, and I am trying to create a grouping variable (dichotomized as 1 or 0) based on the responses in those columns. My data looks like this:

df1 <- structure(list(M_Fha_Dha1_1 = c(2, 2, 1, 4, 0), M_Fha_Dha1_2 = c(5, 
4, 4, 4, 1), M_Fla_Dha1_1 = c(NaN, NaN, NaN, NaN, NaN), M_Fla_Dha2_2 = c(NaN, 
NaN, NaN, NaN, NaN)), row.names = c(NA, -5L), class = "data.frame")

I am trying to create a new dummy variable (Attract; 1 or 0) based on the responses to existing variables. Each variable takes values from 1 to 5 unless the participant did not respond, in which case it is NA. I have been trying to use dplyr to achieve this. Basically, I want to take all rows with responses to variables titled "M_Fha_Dha_1..." and assign a 1 to my Attract variable. Then, I want to take all rows with responses to variables titled "M_Fla_Dha_1..." and assign a 0 to my Attract variable. The idea is to create a grouping variable that records which experimental group each participant was sorted into.

I can achieve this for a single column:

df1 <- df1 %>%
  mutate(Attract = case_when(M_Fha_Dha1_1 >= 1 ~ 1))

However, I don't want to manually write another 249ish lines of code to achieve the same thing for the other columns. I want to mutate() across each relevant column (i.e., I want to be able to select the columns I mutate() over).

To do this, I have tried to amend my code with the across() function, but I get an error:

table <- table %>%
  mutate(across(M_Fha_Dha1_1:M_Fla_Dha2_2, ~ case_when(Attract = 1 ~ 1)))
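
One possible repair, offered as a minimal sketch and not a definitive answer: across() returns one transformed column per input column, whereas the goal here is a single row-wise summary, so if_any() (available in dplyr 1.0.4 and later) may be a closer fit. The sketch assumes that the group is identified by "Fha" in the column name (selected with contains("Fha")), that a "response" means any non-missing value (so the 0 in row 5 counts as a response), and that the data frame is called df1.

```r
library(dplyr)

# Example data from the question
df1 <- structure(list(M_Fha_Dha1_1 = c(2, 2, 1, 4, 0), M_Fha_Dha1_2 = c(5, 
4, 4, 4, 1), M_Fla_Dha1_1 = c(NaN, NaN, NaN, NaN, NaN), M_Fla_Dha2_2 = c(NaN, 
NaN, NaN, NaN, NaN)), row.names = c(NA, -5L), class = "data.frame")

# if_any() gives one logical per row: TRUE when at least one selected
# column satisfies the predicate. is.na() is TRUE for both NA and NaN.
# Assumption: a "response" is any non-missing value in an "Fha" column.
df1 <- df1 %>%
  mutate(Attract = as.integer(if_any(contains("Fha"), ~ !is.na(.x))))
```

With the five example rows, every row has values in the Fha columns, so Attract is 1 throughout; rows with no Fha responses would get 0. Swapping contains("Fha") for another tidyselect helper, or for a range such as M_Fha_Dha1_1:M_Fha_Dha1_2, changes which columns are checked.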

Any help would be fantastic! I'm not sure if I have made sense, so I am happy to clarify anything.

Hello Alex Marshall. Welcome to SO. Please add a [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) to help getting the best possible answer. — harre, Aug 14 '22 at 11:59
Can you copy the output of `dput(df1)`? This will make it easier to reproduce your problem. — Julian, Aug 14 '22 at 14:49
Hi, I have created a smaller data.frame as an example - sorry I didn't do this initially! — Alex Marshall, Aug 14 '22 at 22:08
What does your expected output look like? Do you want 249 new attract columns, one for each input variable? Also, can you show a bigger data frame dput, maybe 30 rows? Seeing all the columns as the same doesn't illustrate any differences between columns. — dcsuka, Aug 15 '22 at 04:57
Hi, the output should be a single new variable (Attract) populated with 1s and 0s. Where 1s represents individuals that responded to the variables with names including Fha. Individuals that did not respond to variables with names including Fha should get a 0 in the new Attract variable. The structures of the rows is the same with 5 rows or 30 rows, what would 30 rows show that 5 does not? Sorry for my confusion. — Alex Marshall, Aug 15 '22 at 05:44
Alex, can you add that last comment about expected output to the question? Welcome to SO, it's awesome that you're updating as you go! — Michael Roswell, Aug 16 '22 at 18:40
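
The expected output described in the comments (a single Attract column of 1s and 0s) can also be sketched in base R without dplyr. The data frame demo below is hypothetical and invented purely for illustration, since the example in the question contains only Fha responders; matching on the literal string "Fha" and treating any non-missing value as a response are likewise assumptions.

```r
# Hypothetical data: rows 1-2 answered the Fha items, rows 3-4 the Fla items
demo <- data.frame(
  M_Fha_Dha1_1 = c(2, 4, NA, NA),
  M_Fha_Dha1_2 = c(5, NA, NA, NA),
  M_Fla_Dha1_1 = c(NA, NA, 3, 1),
  M_Fla_Dha2_2 = c(NA, NA, 2, NA)
)

# Logical index of the columns whose names include "Fha"
fha_cols <- grepl("Fha", names(demo), fixed = TRUE)

# 1 if the row has at least one non-missing Fha value, otherwise 0
demo$Attract <- as.integer(rowSums(!is.na(demo[, fha_cols, drop = FALSE])) > 0)
```

Here rows 1 and 2 receive Attract = 1 and rows 3 and 4 receive 0. The drop = FALSE argument keeps the selection a data frame even if only one column matches, so rowSums() still works.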

