I am analysing the WHO dataset on suicide rates. The dataset has a 'sex' column with the values 'male' and 'female', but no 'both' rows. I want to sum 'suicides_no' across the 'female' and 'male' rows, i.e. add the female and male rows together for each year.

Example input:

       date     sex  suicides_no
31332  1985  female        296.0
31338  1985    male        678.0

Example output:

       date   sex    suicides_no
31332  1985  both        974

or

       date     suicides_no
31332  1985         974
harryhow

1 Answer

Try this:

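# Select 'suicides_no' explicitly: in pandas 2.x a bare .sum() would also
# concatenate the strings in 'sex' (e.g. 'femalemale').
totals = df.groupby(by=['date'], as_index=False)['suicides_no'].sum()
print(totals)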

   date  suicides_no
0  1985        974.0

Or this:

# Label every group 'both'; the string 'sum' selects pandas' own aggregation.
both = df.groupby(by=['date'], as_index=False).agg({'sex': lambda x: 'both', 'suicides_no': 'sum'})
print(both)

   date   sex  suicides_no
0  1985  both        974.0
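
If the goal is to keep the original 'female' and 'male' rows and add a 'both' row for each year, the totals can be appended to the original frame. The following is only a minimal sketch: the sample data is copied from the question, and `add_both_rows` is a hypothetical helper name, not part of pandas. It assumes the frame contains only the three columns shown; any extra columns in the real WHO file would be left as NaN on the 'both' rows.

```python
import pandas as pd

# Sample rows copied from the question (assumption: only these three columns).
df = pd.DataFrame(
    {
        "date": [1985, 1985],
        "sex": ["female", "male"],
        "suicides_no": [296.0, 678.0],
    },
    index=[31332, 31338],
)


def add_both_rows(frame):
    """Return the frame plus one 'both' row per date (hypothetical helper)."""
    # Sum the female and male counts for each year.
    totals = frame.groupby("date", as_index=False)["suicides_no"].sum()
    totals["sex"] = "both"
    # Stack the new rows under the original ones; columns are aligned by name.
    combined = pd.concat([frame, totals], ignore_index=True)
    # Sort so each year's 'both', 'female' and 'male' rows sit together.
    return combined.sort_values(["date", "sex"], ignore_index=True)


result = add_both_rows(df)
print(result)
```

Note that the original row labels (31332, 31338) are discarded here because the new 'both' rows have no natural label of their own.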
NYC Coder