1

Here is some example data:

mydf = {'Month': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12],
        'Freq': [5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60]
        }
my_df = pd.DataFrame(mydf, columns=['Month', 'Freq'])
my_df

  Month Freq
0   1   5
1   2   10
2   3   15
3   4   20
4   5   25
5   6   30
6   7   35
7   8   40
8   9   45
9   10  50
10  11  55
11  12  60

How can I create a new dataframe which groups the months into seasons and find the total sum of each season frequency, while the output is still a dataframe?

I would like something like this: (Winter is where Month = 12, 1, 2)(Spring is where Month = 3, 4, 5)(etc....)

   Season Freq
0  Winter 75
1  Spring 60
2  Summer 105
3  Autumn 150

I have tried to select the rows and concatenate them to start with but I keep getting errors unfortunately.

bluesky
  • 153
  • 8
  • Related to [this question](https://stackoverflow.com/questions/60285557/extract-seasons-from-datetime-pandas/60285720#60285720) – Quang Hoang Dec 11 '20 at 18:05

3 Answers3

4

One of the easiest way would be to create a month to season mapper and then use map function from panda

season_map = {1: 'Winter', 2: 'Winter', 3: 'Spring', 4: 'Spring', 5: 'Spring', 6: 'Summer', 7: 'Summer', 8: 'Summer', 9:'Autumn', 10:'Autumn', 11: 'Autumn', 12: 'Winter'}
my_df.loc[:, 'season'] = my_df.Month.map(season_map)
my_df.groupby('season').freq.sum()

If you don't want to manually create the mapper, you can use this answer: Python: Datetime to season

Pawan
  • 1,066
  • 1
  • 10
  • 16
3

You can create a new column with seasons and group on that column:

my_df['Season']=df['Month'].apply(lambda x: 'Winter' if x in (12,1,2) else 'Spring' if x in (3,4,5) else 'Summer' if x in (6,7,8) else 'Autumn')

res=my_df.groupby('Season')['Freq'].sum()

>>> print(res)

Season
Autumn    150
Spring     60
Summer    105
Winter     75
IoaTzimas
  • 10,538
  • 2
  • 13
  • 30
0

This is not as stylish as it should be (due to the big amount of if statements) but it works:

    import pandas as pd

mydf = {'Month': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12],
        'Freq': [5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60]}
my_df = pd.DataFrame(mydf, columns=['Month', 'Freq'])

winter_counter = 0
spring_counter = 0
summer_counter = 0
autumn_counter = 0

for i in range(len(my_df)):
    #print(my_df.at[i,'Month'])
    if(my_df.at[i,'Month'] == 12 or my_df.at[i,'Month'] == 1 or my_df.at[i,'Month'] == 2):
        winter_counter = winter_counter + my_df.at[i,'Freq']
    if(my_df.at[i,'Month'] == 3 or my_df.at[i,'Month'] == 4 or my_df.at[i,'Month'] == 5):
        spring_counter = spring_counter + my_df.at[i,'Freq']
    if(my_df.at[i,'Month'] == 6 or my_df.at[i,'Month'] == 7 or my_df.at[i,'Month'] == 8):
        summer_counter = summer_counter + my_df.at[i,'Freq']
    if(my_df.at[i,'Month'] == 9 or my_df.at[i,'Month'] == 10 or my_df.at[i,'Month'] == 11):
        autumn_counter = autumn_counter + my_df.at[i,'Freq']

data_for_result = {
    'Season': ['Winter','Spring','Summer','Autumn'],
    'Freq': [winter_counter, spring_counter, summer_counter, autumn_counter],
}
my_result = pd.DataFrame(data_for_result,columns = ['Season','Freq'])
print(my_result)

In case you need an explanaition:

.at: Accesses a singular value [row,columnName], I use it first to see what Season this row belongs to, then to access freq in order to add it to its correspondant counter

https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.at.html#pandas.DataFrame.at