How to add a column of aggregated values in a dataframe?

Question

I want to add a column to my dataframe that shows the total frequency for each age group, so I can calculate percentages in an additional column afterward. Right now I have two dataframes: the one I want to work with

Residential Status	age_group	frequency
1	50-59	5327
1	60-69	1962
1	70-79	224
1	80-85	16
2	50-59	1260
2	60-69	1176
2	70-79	428
2	80-85	75
...

and the one that has the aggregate values.

age_group	group total
50-59	117812
60-69	71868
70-79	18796
80-85	6310

I want it to look like this:

Residential Status	age_group	frequency	group total
1	50-59	5327	117812
1	60-69	1962	71868
1	70-79	224	18796
1	80-85	16	6310
2	50-59	1260	117812
2	60-69	1176	71868
2	70-79	428	18796
2	80-85	75	6310

I have tried using merge(), but it's literally adding the second dataframe on top of the first. I also tried to use summarise(), but that didn't work either. Any ideas?

`merge(your_first_data_frame, your_second_data_frame, by = "age_group")` should work just fine. Or with `dplyr` `first_data_frame |> left_join(second_data_frame)`. It would be nice to see what code you tried that didn't work. Perhaps your `age_group` columns are different classes in the different data frames? Maybe `factor` in one and `character` in another? Or `factor` classes with different levels? If you convert them both to `character` class with `as.character()` that would solve that problem. — Gregor Thomas, Jul 05 '23 at 17:48
If you still have trouble after that, inspect the values closely in the `age_group` column to make sure you don't have unneeded white space like `"80-85 "`, or other irregularities. If you still have the issue, please edit your question to share the sample data with `dput()`, e.g., `dput(your_first_data_frame[1:10, ])` for the first 10 rows of one data frame. That will include all data structure information so we can inspect more closely. — Gregor Thomas, Jul 05 '23 at 17:49
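
For illustration, here is a minimal sketch of the approach the comments describe, using the sample rows from the question. The data frame names (`freq`, `totals`) and the syntactic column names (`residential_status`, `group_total`) are assumptions made for this example; the real data has spaces in its column names and may differ in other ways. The sketch coerces `age_group` to character and trims whitespace in both data frames, checks for unmatched keys, then joins and computes the percentage.

```r
# Sample data copied from the question; object and column names are placeholders.
freq <- data.frame(
  residential_status = c(1, 1, 1, 1, 2, 2, 2, 2),
  age_group = rep(c("50-59", "60-69", "70-79", "80-85"), times = 2),
  frequency = c(5327, 1962, 224, 16, 1260, 1176, 428, 75)
)
totals <- data.frame(
  age_group = c("50-59", "60-69", "70-79", "80-85"),
  group_total = c(117812, 71868, 18796, 6310)
)

# Normalise the join key in both data frames:
# character class, no leading or trailing whitespace.
freq$age_group <- trimws(as.character(freq$age_group))
totals$age_group <- trimws(as.character(totals$age_group))

# Keys present in freq but missing from totals.
# An empty result means every row of freq will find a match.
unmatched <- setdiff(unique(freq$age_group), totals$age_group)

# Left join on age_group so each row of freq gets its group total.
result <- merge(freq, totals, by = "age_group", all.x = TRUE)

# merge() sorts by the key by default; restore status-then-age ordering.
result <- result[order(result$residential_status, result$age_group), ]

# Percentage of each row's frequency within its age group total.
result$percent <- 100 * result$frequency / result$group_total
```

If `unmatched` is not empty, the listed values are the ones worth inspecting for whitespace or spelling differences. With `dplyr` loaded, `left_join(freq, totals, by = "age_group")` should give the same columns while keeping the original row order of `freq`.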
