I have next data table dataframe
library(dplyr)
library(data.table)
my_data = data.frame(
id = c(1, 1, 2, 2, 3),
sample_number = c('d1', 'rr1', 'd2', 'rr2', 'd3'),
res_1 = c('AA', NA, NA, 'GG', 'AG'),
res_2 = c(NA, 'TT', 'CC', NA, 'TC'),
res_3 = c('II', 'II', 'DD', 'ID', 'ID')
)
my_data <- my_data %>% as.data.table() ## convert to data table
> my_data
id sample_number res_1 res_2 res_3
1 1 d1 AA <NA> II
2 1 rr1 <NA> TT II
3 2 d2 <NA> CC DD
4 2 rr2 GG <NA> ID
5 3 d3 AG TC ID
Uniq column is id
. For some id
exists 2 rows with different values in sample_number
column. How can I unite rows by id column?
For id 2 in column res_3
mistake exists. In that case result of unite will by '---'. Result is next
id sample_number res_1 res_2 res_3
1 d1, rr1 AA TT II
2 d2, rr2 GG CC '---'
3 d3 AG TC ID