Grouping columns and summing column-wise to keep one row per group

Asked Jun 13 '23 at 13:32

Active Jun 13 '23 at 14:11

Viewed 19 times

I have the following data frame structure:

ID	conception_date	birth_date	med_1	med_2	med_3	med_4	...
A	xxxx	xxxx	1	0	0	0	...
A	xxxx	xxxx	0	1	0	0	...
A	xxxx	xxxx	0	0	1	0	...
B	xxxx	xxxx	1	0	0	0	...
B	xxxx	xxxx	0	1	0	0	...
B	xxxx	xxxx	0	0	1	0	...
B	xxxx	xxxx	0	0	0	1	...
C	xxxx	xxxx	1	0	0	0	...
C	xxxx	xxxx	0	0	0	1	...

I would like to group the rows by ID, conception_date and birth_date so that each person is kept on a single row, with each medication column summed within the group. The structure would become:

ID	conception_date	birth_date	med_1	med_2	med_3	med_4
A	xxxx	xxxx	1	1	1	0
B	xxxx	xxxx	1	1	1	1
C	xxxx	xxxx	1	0	0	1

edited Jun 13 '23 at 14:11

Greg


asked Jun 13 '23 at 13:32

Youknowme

Please provide better sample data and explain what you have tried thus far. – Chamkrai Jun 13 '23 at 13:41
Is the `N_person` row supposed to be a "total row", and if so, why is it in the original, unsummarized dataset? And is `med_N` a rowwise sum of `med_1` through `med_4`, or is it just a placeholder, as if to say "there are `N` columns of the form `med_*`"? – Greg Jun 13 '23 at 13:41
@Greg No, `N_person` and `N_med` indicate that the data go up to the Nth ID and the Nth medication, as I have more than 4 medications and more than 3 IDs. I updated the data frame to include only 3 IDs and 4 medications to avoid confusion. – Youknowme Jun 13 '23 at 13:46
`library(dplyr); your_data |> summarize(across(everything(), sum), .by = c(ID, conception_date, birth_date))` – Gregor Thomas Jun 13 '23 at 13:47
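The one-liner in the comment above can be tried on the sample data from the question. The following is a minimal sketch, not a definitive answer: it assumes dplyr 1.1.0 or later (for the `.by` argument) and R 4.1 or later (for the native pipe), rebuilds the sample table by hand with the `xxxx` dates kept as placeholder strings, and restricts the sum to columns whose names start with `med_`, which is an assumption about how the real medication columns are named. The name `result` is only for illustration.

```r
library(dplyr)  # the .by argument needs dplyr >= 1.1.0

# Sample data rebuilt from the question; the "xxxx" dates are placeholders
your_data <- data.frame(
  ID              = c("A", "A", "A", "B", "B", "B", "B", "C", "C"),
  conception_date = "xxxx",
  birth_date      = "xxxx",
  med_1 = c(1, 0, 0, 1, 0, 0, 0, 1, 0),
  med_2 = c(0, 1, 0, 0, 1, 0, 0, 0, 0),
  med_3 = c(0, 0, 1, 0, 0, 1, 0, 0, 0),
  med_4 = c(0, 0, 0, 0, 0, 0, 1, 0, 1)
)

# One row per (ID, conception_date, birth_date), summing every med_* column
result <- your_data |>
  summarize(
    across(starts_with("med_"), sum),
    .by = c(ID, conception_date, birth_date)
  )

print(result)
```

If the same medication can appear on several rows for one person, `sum` will return counts greater than 1; swapping `sum` for `max` would keep the columns as 0/1 flags instead.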


0 Answers