I have a dataset with the number of suicides and population broken down by age group for each year and country, including NA values for some countries/years.

I would like to sum the number of suicides and populations across age groups for each year and country but leave NA if a given country has no data for a given year.

Input would look like:

country = c("Albania", "Albania", "Croatia", "Croatia", "Croatia", "Croatia")
year = c("1991", "1991", "1991", "1991", "1992", "1992")
suicides_no = c(NA, NA, 5, 3, 12, 9)
population = c(100, 200, 50, 75, 250, 300)
df = data.frame(country, year, suicides_no, population)

and the output columns would be:

country year suicides_no population
Albania 1991 NA          300
Croatia 1991 8           125
Croatia 1992 21          550

2 Answers

Group by country and year, then summarise each group, like below:

library(dplyr)

df %>%
  group_by(country, year) %>%
  summarise(suicides_no_sum = sum(suicides_no), population_sum = sum(population))

Should give:

  country year  suicides_no_sum population_sum
  <chr>   <chr>           <dbl>          <dbl>
1 Albania 1991               NA            300
2 Croatia 1991                8            125
3 Croatia 1992               21            550
JineshEP
  • Hi thanks Jinesh. This is giving me just one row output that says NA in suicide_no and a huge number in population. Also how do I save this into my existing dataframe? – engelbrekt Nov 15 '20 at 11:50
  • @engelbrekt Just assign the result to your data frame: df = df %>% group_by...... – JineshEP Nov 15 '20 at 11:54
  • Hey, that didn't work. My data frame is now just 'data.frame': 1 obs. of 2 variables: $ suicides_no_sum: num NA $ population_sum : num 975 – engelbrekt Nov 15 '20 at 11:56
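
A one-row result whose population is 975 (the total over all six rows) suggests the grouping was ignored. One common cause, which is an assumption here since the thread does not say which packages were loaded, is that plyr was attached after dplyr, so that summarise()/summarize() resolves to the plyr version, which does not respect dplyr groups. A minimal sketch that sidesteps this by calling the dplyr functions with an explicit namespace:

```r
library(dplyr)

# Example data from the question
df <- data.frame(
  country = c("Albania", "Albania", "Croatia", "Croatia", "Croatia", "Croatia"),
  year = c("1991", "1991", "1991", "1991", "1992", "1992"),
  suicides_no = c(NA, NA, 5, 3, 12, 9),
  population = c(100, 200, 50, 75, 250, 300)
)

# Qualify the verbs with dplyr:: so that a masking package
# (assumed here to be plyr) cannot replace them
result <- df %>%
  dplyr::group_by(country, year) %>%
  dplyr::summarise(
    suicides_no_sum = sum(suicides_no),
    population_sum = sum(population),
    .groups = "drop"  # return an ungrouped tibble (dplyr 1.0.0 or later)
  )

print(result)
```

If this gives three rows, checking conflicts() or the order of the library() calls should show which package was masking summarise().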
Using the across() function:

library(dplyr)
df <- df %>% group_by(country, year) %>% summarise(across(suicides_no:population, sum))
#> `summarise()` regrouping output by 'country' (override with `.groups` argument)
df
#> # A tibble: 3 x 4
#> # Groups:   country [2]
#>   country year  suicides_no population
#>   <chr>   <chr>       <dbl>      <dbl>
#> 1 Albania 1991           NA        300
#> 2 Croatia 1991            8        125
#> 3 Croatia 1992           21        550
Karthik S
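
A note on partially missing groups: plain sum() returns NA as soon as any value in the group is NA, which matches the example only because Albania 1991 is missing entirely. If a group can have some age bands missing and the goal is NA only when the whole group has no data, a small helper is one option. This is a sketch, not part of either answer: sum_or_na is a hypothetical name, and one Croatia 1991 value has been changed to NA purely for illustration.

```r
library(dplyr)

# Hypothetical helper: NA only when every value in the group is missing,
# otherwise the sum of the non-missing values
sum_or_na <- function(x) {
  if (all(is.na(x))) NA_real_ else sum(x, na.rm = TRUE)
}

# Question's data, with one Croatia 1991 value set to NA for illustration
df <- data.frame(
  country = c("Albania", "Albania", "Croatia", "Croatia", "Croatia", "Croatia"),
  year = c("1991", "1991", "1991", "1991", "1992", "1992"),
  suicides_no = c(NA, NA, 5, NA, 12, 9),
  population = c(100, 200, 50, 75, 250, 300)
)

result <- df %>%
  group_by(country, year) %>%
  summarise(across(c(suicides_no, population), sum_or_na), .groups = "drop")

print(result)
```

Whether a partial sum is meaningful depends on the analysis. With this helper a country-year with some age groups missing is undercounted rather than flagged, so it may be worth keeping a count of non-missing rows alongside the total.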