
I have a dataframe like this:

enter image description here

I want to get the sum of each column [Gliders, Helicopters, Jet engines, Piston engines, Turbo-propellers] for every state in each year, so that in the end there is only one row per state for each REF_DATE.

Ex: For REF_DATE=2012 if there are 23 states I will get 23 rows with the same REF_DATE (2012), and so on for each REF_DATE.

The condition should be something like:

if((REF_DATE==REF_DATE)&&(State==State)):
  sum(Gliders)
  sum(Helicopters)
  sum(...)

It will look something like this:

enter image description here

As @MattPitkin pointed out, it works fine with groupby. The answer was:

df_states = df_ap.groupby(['REF_DATE', 'State']).sum()

enter image description here
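For anyone who cannot see the images, here is a minimal, self-contained sketch of the same groupby approach. The sample values, the state labels "A" and "B", and the reduced set of columns are made up for illustration; only the column names `REF_DATE`, `State`, `Gliders` and `Helicopters` come from the question. Passing `as_index=False` is an optional choice that keeps the grouping keys as ordinary columns instead of a MultiIndex.

```python
import pandas as pd

# Hypothetical sample data: column names follow the question, values are invented.
df_ap = pd.DataFrame({
    "REF_DATE": [2012, 2012, 2012, 2013, 2013],
    "State": ["A", "A", "B", "A", "B"],
    "Gliders": [1, 2, 3, 4, 5],
    "Helicopters": [10, 20, 30, 40, 50],
})

# Columns to add up; extend this list with the other aircraft columns as needed.
value_cols = ["Gliders", "Helicopters"]

# One output row per (REF_DATE, State) pair, with each value column summed.
# as_index=False keeps REF_DATE and State as regular columns.
df_states = df_ap.groupby(["REF_DATE", "State"], as_index=False)[value_cols].sum()

print(df_states)
```

Selecting `value_cols` explicitly before calling `.sum()` avoids accidentally summing or concatenating any other columns the real dataframe might contain.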

yaviens
  • for some reason your image isn't visible. – Sifat Haque Jan 11 '23 at 09:10
  • See the pandas [`groupby`](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.groupby.html) method, e.g., `df.groupby(["State", "REF_DATE"]).sum()` – Matt Pitkin Jan 11 '23 at 09:13
  • Thanks @MattPitkin It worked just fine with groupby, the answer was: df_states = df_ap.groupby(['REF_DATE', 'State']).sum() – yaviens Jan 11 '23 at 09:16
