Merging multiple columns based on conditions in R

Question

I have a data frame with several columns I want to merge only 5 of those.

ID	result___1	result___3	result___4	result___5
1	First	0	0	0
3	Second	First	0	0
4	First	First	0	0
5	Third	First	Second	0
6	Second	First	0	Other
7	Second	First	0	Other

The merge of those will be in a new columne, seperated by "," if needed. For example for the first raw I want to keep only "First", for the second raw "Second, First" etc. My last column has text where it is different each time (because it was filled by interviewees). So the results it could be "Second, First, the interviewees text".

Following <https://stackoverflow.com/questions/50874955/merging-multiple-columns-in-a-dataframe-based-on-condition-in-r >

This is my code until now. But I don't get what I want

df %>% mutate_at(vars(starts_with("result___")), funs(na_if(.,"0"))) %>%
  mutate(demo_race = coalesce(df$result___1,df$result___2,df$result___3,df$result___4,df$result___5)) %>%
  select(ID, df$resultsnew)

Any hint how can I solve my problem?

Thank you in advance

Did I answer your question below? If so, can you please click the green check mark next to it so that others know this question is closed? Thanks! — NovaEthos, Oct 10 '21 at 17:53

NovaEthos · Answer 1 · 2021-10-08T18:10:16.600

First, I recreated your data:

df <- data.frame(
  ID = c(1,3,4,5,6,7),
  result__1 = c("First", "Second", "First", "Third","Second", "Second"),
  result__2 = c(0,0,0,0,0,0),
  result__3 = c(0, "First","First","First","First","First"),
  result__4 = c(0,0,0,"Second",0,0),
  result__5 = c(0,0,0,0,"Other","Other"),
  text = c(rep("sample text",6))
)

You can accomplish this using a combination of unite from tidyr and str_remove_all from stringr:

tidyr::unite(df, text, sep = ",", remove = FALSE) %>% # pasting the columns together
  mutate(new_text = stringr::str_remove_all(text, ",0|[0-9]+,")) %>%  # getting rid of the IDs and the 0s
  select(-text) # we don't need this column anymore

  ID result__1 result__2 result__3 result__4 result__5                       new_text
1  1     First         0         0         0         0              First,sample text
2  3    Second         0     First         0         0       Second,First,sample text
3  4     First         0     First         0         0        First,First,sample text
4  5     Third         0     First    Second         0 Third,First,Second,sample text
5  6    Second         0     First         0     Other Second,First,Other,sample text
6  7    Second         0     First         0     Other Second,First,Other,sample text

Where the new_text column is your desired output. sample text is what I used to represent the interviewees text column you talked about.

because there are several rows (and not only 6, as I wrote in the example) with different text I don't want to specify the text. How can I do it? furthermore, I want when having the "other" to go to another column which is the "result_specify" where the text is, and keep that text only. For example for the ID 6 "Second, First, whatever he has written" — Meli, Oct 11 '21 at 14:38
I'm a bit confused. I created the ```text``` column to represent whatever the interviewee put in as text. Are you saying you also want to remove whatever is in the ```result__5``` column from the the ```new_text``` column and then also put that in its own column? — NovaEthos, Oct 11 '21 at 19:33

Merging multiple columns based on conditions in R

1 Answers1