0

I have a df like this:

Year        2016    2017    
Month               
1       0.979000    1.109000    
2       0.974500    1.085667    
3       1.004000    1.075667    
4       1.027333    1.184000    
5       1.049000    1.089000    
6       1.013250    1.085500    
7       0.999000    1.059000    
8       0.996667    1.104000    
9       1.024000    1.121333    
10      1.019000    1.126333    
11      0.949000    1.183000    
12      1.074000    1.203000    

How can I add a 'Season' column populated with "Spring", "Summer", etc. based on the numerical value of the month? E.g., months 12, 1, and 2 = Winter, and so on.

jros112
  • 3
    Kindly post your expected output. Taking your data at face value, a `map` with a dictionary could do it for you: `df.assign(season=df.Month.map({1: 'Spring', 2: 'Winter', ...}))` – sammywemmy Sep 26 '21 at 23:51
  • It looks like Month is the `index` not a column is that correct? – Henry Ecker Sep 27 '21 at 00:08
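The dictionary `map` idea from the comments can be sketched as follows, for the case where Month is the index. This is only a sketch: the frame is a small stand-in built from a few of the question's rows, the name `month_to_season` is made up for illustration, and the month-to-season grouping (Dec to Feb as Winter, and so on) is an assumption based on the question's example.

```python
import pandas as pd

# Stand-in for the question's frame: Month is the index, years are columns.
df = pd.DataFrame(
    {2016: [0.979, 1.004, 1.01325, 1.024, 1.074],
     2017: [1.109, 1.075667, 1.0855, 1.121333, 1.203]},
    index=pd.Index([1, 3, 6, 9, 12], name="Month"),
)

# Assumed grouping of months into seasons (hypothetical helper dict).
month_to_season = {
    12: "Winter", 1: "Winter", 2: "Winter",
    3: "Spring", 4: "Spring", 5: "Spring",
    6: "Summer", 7: "Summer", 8: "Summer",
    9: "Fall", 10: "Fall", 11: "Fall",
}

# to_series() keeps the Month index, so the assignment aligns row by row.
df["Season"] = df.index.to_series().map(month_to_season)
print(df)
```

If Month is a regular column instead, `df["Month"].map(month_to_season)` should work the same way.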

2 Answers

0

You could use np.select with pd.Series.between. This assumes Month is a regular column; if it is the index, call df.reset_index() first:

import numpy as np
df["Season"] = np.select([df["Month"].between(3, 5), 
                          df["Month"].between(6, 8),
                          df["Month"].between(9, 11)], 
                          ["Spring", "Summer", "Fall"], 
                          "Winter")

    Month      2016      2017  Season
0       1  0.979000  1.109000  Winter
1       2  0.974500  1.085667  Winter
2       3  1.004000  1.075667  Spring
3       4  1.027333  1.184000  Spring
4       5  1.049000  1.089000  Spring
5       6  1.013250  1.085500  Summer
6       7  0.999000  1.059000  Summer
7       8  0.996667  1.104000  Summer
8       9  1.024000  1.121333    Fall
9      10  1.019000  1.126333    Fall
10     11  0.949000  1.183000    Fall
11     12  1.074000  1.203000  Winter
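If Month is the index, as the printout in the question suggests, the same np.select idea can be applied to the index directly. A minimal self-contained sketch, using a small stand-in frame built from a few of the question's rows (the variable name `month` is just for illustration):

```python
import numpy as np
import pandas as pd

# Stand-in for the question's frame: Month is the index, years are columns.
df = pd.DataFrame(
    {2016: [0.9745, 1.027333, 0.999, 1.019],
     2017: [1.085667, 1.184, 1.059, 1.126333]},
    index=pd.Index([2, 4, 7, 10], name="Month"),
)

# Turn the index into a Series so that .between() is available.
month = df.index.to_series()

# Each condition pairs with the label at the same position;
# rows matching no condition fall back to the default.
df["Season"] = np.select(
    [month.between(3, 5), month.between(6, 8), month.between(9, 11)],
    ["Spring", "Summer", "Fall"],
    default="Winter",
)
print(df)
```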
not_speshal
-1

You could iterate through the column, appending each season to a new list that you then add as a column.

for i in df['Year Month'] :
    if i == 12 or 1 or 2 :
        i = "Winter"
        df2.append(i)

Then add your other conditions with elif and else branches, and add the result onto your main df afterwards. Let me know if this helps.

  • 1
    You [shouldn't](https://stackoverflow.com/a/55557758/9857631) iterate over a DataFrame when there are vectorized solutions available. As an aside, you are overwriting your `for` loop variable `i` within the loop (with "Winter"). And your condition should be `if i==12 or i==1 or i==2`. – not_speshal Sep 27 '21 at 00:04
  • 1
    `i == 12 or 1 or 2` is always `True` [Why does `a == x or y or z` always evaluate to True?](https://stackoverflow.com/q/20002503/15497888) – Henry Ecker Sep 27 '21 at 00:05
  • @not_speshal thank you I was not aware of that – Joseph Hogan Sep 27 '21 at 18:53
  • 1
    @JosephHogan - All good. PS - I wasn't the downvoter (I also got downvoted) – not_speshal Sep 27 '21 at 18:55
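For reference, the fixes pointed out in the comments above can be sketched as a working loop. This is only an illustration: it assumes Month is a regular column, the frame and the `seasons` list are hypothetical, and a vectorized approach is still preferable for real data.

```python
import pandas as pd

# Hypothetical frame with Month as a regular column.
df = pd.DataFrame({"Month": [12, 1, 2, 3, 7, 10]})

# `5 == 12 or 1 or 2` parses as `(5 == 12) or 1 or 2`, which is always truthy.
print(bool(5 == 12 or 1 or 2))  # True

seasons = []  # plain Python list collecting one label per row
for month in df["Month"]:
    # Membership tests compare the month against every value in the tuple.
    if month in (12, 1, 2):
        seasons.append("Winter")
    elif month in (3, 4, 5):
        seasons.append("Spring")
    elif month in (6, 7, 8):
        seasons.append("Summer")
    else:
        seasons.append("Fall")

# The list has one entry per row, so it can be assigned as a column.
df["Season"] = seasons
print(df)
```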