pandas find which half year a date belongs to

Question

I have the following pandas data frame with a date column. How can I add a column specifying which half year the date belongs to?

Please never post screenshots of data, always post the actual data \[[How to make good reproducible pandas examples](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples)\]. Because here **we need to know the dtype of your 'date' column: string, object, Python datetime, pandas pd.Timestamp...** — smci, Jan 17 '22 at 09:13

jezrael · Accepted Answer · 2022-01-17T08:44:05.863

2

Convert dates to datetimes and then use numpy.where with compare for less or equal:

df['date'] = pd.to_datetime(df['date'], dayfirst=True)

df['half year'] = np.where(df['date'].dt.month.le(6), 'H1', 'H2')
print (df)
        date half year
0 1993-09-09        H2
1 1993-09-11        H2
2 1994-01-23        H1
3 1993-03-18        H1

Solution without numpy with change mask for greater like 6, add 1 and convert to strings:

df['date'] = pd.to_datetime(df['date'], dayfirst=True)

df['half year'] = 'H' + df['date'].dt.month.gt(6).add(1).astype(str)
print (df)
        date half year
0 1993-09-09        H2
1 1993-09-11        H2
2 1994-01-23        H1
3 1993-03-18        H1

edited Jan 17 '22 at 08:44

answered Jan 17 '22 at 08:41

jezrael

822,522
95
1,334
1,252

1

Thank you. Is it possible without numpy? – dInGd0nG Jan 17 '22 at 08:43
@dInGd0nG - answer was edited. – jezrael Jan 17 '22 at 08:47

score 1 · Answer 2 · answered Jan 17 '22 at 08:42

1

Try:

df['half year'] = 'H' + pd.to_datetime(df['date']).dt.month.floordiv(6).add(1).astype(str)
print(df)

# Output
         date half year
0  09-09-1993        H2
1  18-03-1993        H1

answered Jan 17 '22 at 08:42

Corralien

109,409
8
28
52

score 1 · Answer 3 · answered Jan 17 '22 at 08:43

This solution uses .apply method.

>>> import pandas as pd
>>>
>>> df = pd.DataFrame({'date': ['09-09-1993', '11-09-1993', '23-01-1994', '18-03-1993']})
>>>
>>> df['date'] = pd.to_datetime(df['date'], format='%d-%m-%Y')
>>> df
        date
0 1993-09-09
1 1993-09-11
2 1994-01-23
3 1993-03-18
>>>
>>> df['half year'] = df.date.dt.month.apply(lambda x: "H1" if x in range(0, 7) else "H2")
>>> df
        date half year
0 1993-09-09        H2
1 1993-09-11        H2
2 1994-01-23        H1
3 1993-03-18        H1

hpchavaz · Answer 4 · 2022-01-17T10:09:14.537

Without numpy, a rather general solution that could easily be modified to get semesters as 'first' or 'second':

build a pandas.DataFrame (not in the provided code)

(Code starts from here)*

convert the 'date' column to datetime
extract month as a pandas.Series from df['date'] using pandas.Series.dt.month
integer divide by 6 to get the semesters as a pandas.Series of int (counting from 0)
get the semester strings as a pandas.Series using pandas.Series.map with the semesters ints as indices on a semester names list
bundling (steps 2. to 4.) with pandas.Categorical to get categorical column (lower space, faster processing), this step is optional

>>> df['date'] = pd.to_datetime(df['date'])
>>> df['half year'] = pd.Categorical((df['date'].dt.month // 6).map(lambda h:['H1','H2'][h]))
>>> df
        date half year
0 1993-09-09        H2
1 1993-09-11        H2
2 1994-01-23        H1
3 1993-03-18        H1
>>> df.dtypes
date         datetime64[ns]
half year          category
dtype: object

pandas find which half year a date belongs to

4 Answers4