So I have two datasets with one having a value ID and say variables a,b and c; the other ones alo has IDs but with variables c,d,e. These two datasets have individuals in common but not all of them. I tried a dyplr::left join but it removes the rows from the second datset that do not match the ones in the first. So I just created and ordered columns in both datasets so that I can just do a rbind. Problem is I now have somethhing like this (example with made up data):
index b c d
1 A B A A
2 B NA C D
3 B B NA NA
4 C D E C
5 D D D D
6 E NA E NA
7 E NA NA F
8 E G NA NA
9 F F F F
and I would like to have:
index b c d
1 A B A A
2 B B C D
3 C D E C
4 D D D D
5 E G E F
6 F F F F
Apparently, a similar question has been asked in Push up and tighten Dataframe. General solution, but it was wih numbers, the sum function does not work in my case. I have tried adapting it replacing by paste and paste0 but it does not work. I also tried the second solution in this Merge rows in one data.frame (setDT+ lapply) it works for the example but I have "Error in eval(bysub, x, parent.frame()) : object 'A' not found" in the true dataset.
The solution as df1 %>% group_by(index) %>% summarise_all(na.omit)
with dplyr proposed in the first comment also works in the example but not on true data because of the error: "Error in summarise_impl(.data, dots) : Column Nom must be length 1 (a summary value), not 2" with "Nom" being the first column after the index made of strings of several words with spaces and punctuation. Doing length(Nom)
outputs the number of lines.
Thanks in advance