1

I have a data frame with several columns I want to merge only 5 of those.

ID result___1 result___2 result___3 result___4 result___5
1 First 0 0 0 0
3 Second 0 First 0 0
4 First 0 First 0 0
5 Third 0 First Second 0
6 Second 0 First 0 Other
7 Second 0 First 0 Other

The merge of those will be in a new columne, seperated by "," if needed. For example for the first raw I want to keep only "First", for the second raw "Second, First" etc. My last column has text where it is different each time (because it was filled by interviewees). So the results it could be "Second, First, the interviewees text".

Following <https://stackoverflow.com/questions/50874955/merging-multiple-columns-in-a-dataframe-based-on-condition-in-r >

This is my code until now. But I don't get what I want

df %>% mutate_at(vars(starts_with("result___")), funs(na_if(.,"0"))) %>%
  mutate(demo_race = coalesce(df$result___1,df$result___2,df$result___3,df$result___4,df$result___5)) %>%
  select(ID, df$resultsnew)

Any hint how can I solve my problem?

Thank you in advance

Meli
  • 33
  • 6
  • Did I answer your question below? If so, can you please click the green check mark next to it so that others know this question is closed? Thanks! – NovaEthos Oct 10 '21 at 17:53

1 Answers1

1

First, I recreated your data:

df <- data.frame(
  ID = c(1,3,4,5,6,7),
  result__1 = c("First", "Second", "First", "Third","Second", "Second"),
  result__2 = c(0,0,0,0,0,0),
  result__3 = c(0, "First","First","First","First","First"),
  result__4 = c(0,0,0,"Second",0,0),
  result__5 = c(0,0,0,0,"Other","Other"),
  text = c(rep("sample text",6))
)

You can accomplish this using a combination of unite from tidyr and str_remove_all from stringr:

tidyr::unite(df, text, sep = ",", remove = FALSE) %>% # pasting the columns together
  mutate(new_text = stringr::str_remove_all(text, ",0|[0-9]+,")) %>%  # getting rid of the IDs and the 0s
  select(-text) # we don't need this column anymore

  ID result__1 result__2 result__3 result__4 result__5                       new_text
1  1     First         0         0         0         0              First,sample text
2  3    Second         0     First         0         0       Second,First,sample text
3  4     First         0     First         0         0        First,First,sample text
4  5     Third         0     First    Second         0 Third,First,Second,sample text
5  6    Second         0     First         0     Other Second,First,Other,sample text
6  7    Second         0     First         0     Other Second,First,Other,sample text

Where the new_text column is your desired output. sample text is what I used to represent the interviewees text column you talked about.

NovaEthos
  • 500
  • 2
  • 10
  • because there are several rows (and not only 6, as I wrote in the example) with different text I don't want to specify the text. How can I do it? furthermore, I want when having the "other" to go to another column which is the "result_specify" where the text is, and keep that text only. For example for the ID 6 "Second, First, whatever he has written" – Meli Oct 11 '21 at 14:38
  • I'm a bit confused. I created the ```text``` column to represent whatever the interviewee put in as text. Are you saying you also want to remove whatever is in the ```result__5``` column from the the ```new_text``` column and then also put that in its own column? – NovaEthos Oct 11 '21 at 19:33