24

I want to convert a datetime series to seasons. For example, I want to replace months 3, 4, 5 with 2 (spring), months 6, 7, 8 with 3 (summer), etc.

So, I have this series

id
1       2011-08-20
2       2011-08-23
3       2011-08-27
4       2011-09-01
5       2011-09-05
6       2011-09-06
7       2011-09-08
8       2011-09-09
Name: timestamp, dtype: datetime64[ns]

and this is the code I have been trying to use, but to no avail.

# Get seasons
spring = range(3, 5)
summer = range(6, 8)
fall = range(9, 11)
# winter = everything else

month = temp2.dt.month
season=[]

for _ in range(len(month)):
    if any(x == spring for x in month):
       season.append(2) # spring 
    elif any(x == summer for x in month):
        season.append(3) # summer
    elif any(x == fall for x in month):
        season.append(4) # fall
    else:
        season.append(1) # winter

and

for _ in range(len(month)):
    if month[_] == 3 or month[_] == 4 or month[_] == 5:
        season.append(2) # spring 
    elif month[_] == 6 or month[_] == 7 or month[_] == 8:
        season.append(3) # summer
    elif month[_] == 9 or month[_] == 10 or month[_] == 11:
        season.append(4) # fall
    else:
        season.append(1) # winter

Neither solution works. Specifically, the first implementation raises an error:

ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

The second produces a large list with errors. Any ideas, please? Thanks.

DSM
Jespar
  • Aside: Python convention tends to use `_` only for variables that you don't intend to refer to later. Seeing `month[_]` is very strange to a Python reader. – DSM May 23 '17 at 01:50
  • Thanks for letting me know – Jespar May 23 '17 at 01:52
  • Possible duplicate of [Determine season given timestamp in Python using datetime](https://stackoverflow.com/questions/16139306/determine-season-given-timestamp-in-python-using-datetime) – iled Apr 08 '18 at 16:48

6 Answers

49

You can use a simple mathematical formula to compress a month to a season, e.g.:

>>> [month%12 // 3 + 1 for month in range(1, 13)]
[1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 1]

So for your use-case using vector operations (credit @DSM):

>>> temp2.dt.month%12 // 3 + 1
1    3
2    3
3    3
4    4
5    4
6    4
7    4
8    4
Name: id, dtype: int64
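
As a self-contained sketch of the same formula, the snippet below rebuilds the question's sample dates under the question's variable name `temp2` and applies the expression. The `season_names` lookup is a hypothetical addition for readability and is not part of the answer above; it assumes the question's coding (1 = winter, 2 = spring, 3 = summer, 4 = fall).

```python
import pandas as pd

# Sample dates from the question; `temp2` is the question's variable name.
temp2 = pd.Series(
    pd.to_datetime([
        "2011-08-20", "2011-08-23", "2011-08-27", "2011-09-01",
        "2011-09-05", "2011-09-06", "2011-09-08", "2011-09-09",
    ]),
    name="timestamp",
)

# month % 12 turns December into 0, so Dec/Jan/Feb share one bucket after
# integer division by 3; adding 1 shifts the codes to 1..4.
season = temp2.dt.month % 12 // 3 + 1

# Hypothetical label lookup (assumption: the question's numbering).
season_names = {1: "winter", 2: "spring", 3: "summer", 4: "fall"}
season_label = season.map(season_names)

print(season.tolist())
print(season_label.tolist())
```

Because `%` and `//` have the same precedence and associate left to right, the expression is evaluated as `((month % 12) // 3) + 1`.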
AChampion
  • That is so brilliant – Jespar May 23 '17 at 01:46
  • Why use `apply` here instead of using vector operations directly? – DSM May 23 '17 at 01:51
  • To complete the answer : if you want to take into account solstices and equinoxes, you can do the following : `season = temp2.dt.month%12 // 3 + 1 # like in the original post` then remove a 'season' by applying a -1 for march, june, september and december `season.loc[(temp2.dt.month.isin((3, 6, 9, 12))) & (temp2.dt.day < 21)] -= 1` – pcotte Jun 17 '22 at 09:29
7

It's also possible to use a dictionary mapping.

  1. Create a dictionary that maps a month to a season:

    In [27]: seasons = [1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 1]
    
    In [28]: month_to_season = dict(zip(range(1,13), seasons))
    
    In [29]: month_to_season 
    Out[29]: {1: 1, 2: 1, 3: 2, 4: 2, 5: 2, 6: 3, 7: 3, 8: 3, 9: 4, 10: 4, 11: 4, 12: 1}
    
  2. Use it to convert the months to seasons:

    In [30]: df.id.dt.month.map(month_to_season) 
    Out[30]: 
    1    3
    2    3
    3    3
    4    4
    5    4
    6    4
    7    4
    8    4
    Name: id, dtype: int64
    

Performance: This is fairly fast

In [35]: %timeit df.id.dt.month.map(month_to_season) 
1000 loops, best of 3: 422 µs per loop
Mohamed Ali JAMAOUI
4

I think a more precise solution may be useful. Given a month (1, ..., 12), we can convert it to a season index by subtracting one and integer-dividing by 3:

import pandas as pd

df = pd.Series(["2011-06-07",
                "2011-08-23",
                "2011-08-27",
                "2011-09-01",
                "2011-09-05",
                "2011-09-06",
                "2011-09-08",
                "2011-12-25"])
df = pd.to_datetime(df)

# Quarter index: months 1-3 -> 0, 4-6 -> 1, 7-9 -> 2, 10-12 -> 3
season = (df.dt.month - 1) // 3

This maps months 1, 2, 3 to 0 (winter), 4, 5, 6 to 1 (spring), 7, 8, 9 to 2 (summer), and 10, 11, 12 to 3 (fall). However, months 3, 6, 9, and 12 each straddle two seasons. I propose the following approach:

If the month is 3 and the day is greater than or equal to 20, the season is spring, and we need to add 1. If the month is 6 and the day is greater than or equal to 21, the season is summer, and we need to add 1. If the month is 9 and the day is greater than or equal to 23, the season is fall, and we need to add 1. If the month is 12 and the day is greater than or equal to 21, the season is winter, and we need to subtract 3 (or add 1 modulo 4). Then we have

season += (df.dt.month == 3)&(df.dt.day>=20)
season += (df.dt.month == 6)&(df.dt.day>=21)
season += (df.dt.month == 9)&(df.dt.day>=23)
season -= 3*((df.dt.month == 12)&(df.dt.day>=21)).astype(int)

The solution for this series will be [1,2,2,2,2,2,2,0].
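
The boundary rules above can also be sketched as a single function that adds 1 modulo 4 in every case, which avoids the special subtraction for December. The function name `astronomical_season` and the `boundary_day` lookup are illustrative names, not part of the answer; the boundary days are the fixed approximations used above, whereas real equinoxes and solstices shift slightly between years.

```python
import pandas as pd


def astronomical_season(dates: pd.Series) -> pd.Series:
    """Return 0=winter, 1=spring, 2=summer, 3=fall for a datetime Series.

    Assumption: fixed boundary days (Mar 20, Jun 21, Sep 23, Dec 21),
    as in the answer above.
    """
    month, day = dates.dt.month, dates.dt.day
    season = (month - 1) // 3  # quarter index 0..3

    # First day of the next season within each boundary month; months that
    # are not boundary months map to NaN, so the comparison below is False.
    boundary_day = month.map({3: 20, 6: 21, 9: 23, 12: 21})
    crossed = (day >= boundary_day).astype(int)

    # Adding 1 modulo 4 wraps late December from 3 (fall) back to 0 (winter).
    return (season + crossed) % 4


dates = pd.to_datetime(pd.Series([
    "2011-06-07", "2011-08-23", "2011-08-27", "2011-09-01",
    "2011-09-05", "2011-09-06", "2011-09-08", "2011-12-25",
]))
result = astronomical_season(dates)
print(result.tolist())
```

For the series used above, this gives the same codes as the step-by-step version.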

2

I think this would work.

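while True:
    # Despite its name, `date` holds the month number (1-12).
    # Note: this groups months by calendar quarter (1-3, 4-6, ...).
    date = int(input("Date?"))
    season = ""
    if date < 4:
        season = 1
    elif date < 7:
        season = 2
    elif date < 10:
        season = 3
    elif date < 13:
        season = 4
    else:
        print("This would not work.")
    print(season)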
John Hao
1
import pandas as pd
import datetime as dt

df = pd.DataFrame({'date': pd.date_range('2000-01-01', '2001-01-01', periods=12)})
seasons = {(1, 12, 2): 1, (3, 4, 5): 2, (6, 7, 8): 3, (9, 10, 11): 4}
df['m'] = df.date.dt.month

def season(ser):
    for k in seasons.keys():
        if ser in k:
            return seasons[k]

df['s'] = df.m.apply(season)  # pass the function `season`, not the dict `seasons`
Out[25]: 
                            date   m  s
0  2000-01-01 00:00:00.000000000   1  1
1  2000-02-03 06:32:43.636363636   2  1
2  2000-03-07 13:05:27.272727273   3  2
3  2000-04-09 19:38:10.909090910   4  2
4  2000-05-13 02:10:54.545454546   5  2
5  2000-06-15 08:43:38.181818182   6  3
6  2000-07-18 15:16:21.818181820   7  3
7  2000-08-20 21:49:05.454545456   8  3
8  2000-09-23 04:21:49.090909092   9  4
9  2000-10-26 10:54:32.727272728  10  4
10 2000-11-28 17:27:16.363636364  11  4
11 2001-01-01 00:00:00.000000000   1  1
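
One possible variation, sketched here as an assumption rather than as the answer's method: the tuple-keyed dictionary can be flattened into a plain month-to-season lookup, so that `Series.map` replaces the Python-level loop inside `apply`. The name `month_to_season` is illustrative.

```python
import pandas as pd

# Tuple-keyed dictionary from the answer above.
seasons = {(1, 12, 2): 1, (3, 4, 5): 2, (6, 7, 8): 3, (9, 10, 11): 4}

# Expand each tuple key into one entry per month, e.g. (3, 4, 5): 2
# becomes {3: 2, 4: 2, 5: 2}.
month_to_season = {m: s for months, s in seasons.items() for m in months}

df = pd.DataFrame({"date": pd.date_range("2000-01-01", "2001-01-01", periods=12)})
df["m"] = df["date"].dt.month
df["s"] = df["m"].map(month_to_season)

print(df[["m", "s"]])
```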
V Z
0

Here is my solution if you want to convert a date to a season while taking both the month and the day of the month into account. It compares day-of-year values against season start dates from an arbitrary non-leap year (2022), so it can be off by a day in leap years:

import pandas as pd
df = pd.DataFrame({'Date': pd.date_range('2022-01-01', '2023-01-01', periods=12)})

winter_start = pd.to_datetime("2022-12-21", format = "%Y-%m-%d").dayofyear
spring_start = pd.to_datetime("2022-3-21", format = "%Y-%m-%d").dayofyear
summer_start = pd.to_datetime("2022-6-21", format = "%Y-%m-%d").dayofyear
autumn_start = pd.to_datetime("2022-9-23", format = "%Y-%m-%d").dayofyear

for index, date in df["Date"].items():
    if (date.dayofyear >= winter_start) or (date.dayofyear < spring_start):
        df.at[index, "Season"] = "Winter"
    elif (date.dayofyear >= spring_start) and (date.dayofyear < summer_start):
        df.at[index, "Season"] = "Spring"
    elif (date.dayofyear >= summer_start) and (date.dayofyear < autumn_start):
        df.at[index, "Season"] = "Summer"
    else:
        df.at[index, "Season"] = "Autumn"

    Out:
    Date                            Season
0   2022-01-01 00:00:00.000000000   Winter
1   2022-02-03 04:21:49.090909091   Winter
2   2022-03-08 08:43:38.181818182   Winter
3   2022-04-10 13:05:27.272727273   Spring
4   2022-05-13 17:27:16.363636364   Spring
5   2022-06-15 21:49:05.454545456   Spring
6   2022-07-19 02:10:54.545454546   Summer
7   2022-08-21 06:32:43.636363636   Summer
8   2022-09-23 10:54:32.727272728   Autumn
9   2022-10-26 15:16:21.818181820   Autumn
10  2022-11-28 19:38:10.909090912   Autumn
11  2023-01-01 00:00:00.000000000   Winter
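
A possible way around the leap-year caveat, offered as a sketch and not as the answer's method, is to compare an integer of the form month * 100 + day instead of the day of year, since that value does not depend on the year. The name `mmdd` is illustrative, and the boundary dates are the same fixed ones used above. `numpy.select` then assigns the labels without an explicit loop.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"Date": pd.date_range("2022-01-01", "2023-01-01", periods=12)})

# Encode month and day as one integer (e.g. 21 June -> 621) so the
# comparison is independent of the year and of leap days.
mmdd = df["Date"].dt.month * 100 + df["Date"].dt.day

conditions = [
    (mmdd >= 321) & (mmdd < 621),    # 21 Mar up to 20 Jun
    (mmdd >= 621) & (mmdd < 923),    # 21 Jun up to 22 Sep
    (mmdd >= 923) & (mmdd < 1221),   # 23 Sep up to 20 Dec
]
# Everything else (21 Dec up to 20 Mar) falls through to the default.
df["Season"] = np.select(conditions, ["Spring", "Summer", "Autumn"], default="Winter")

print(df)
```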
VoV