1

I'm dealing with a dataset that has a column called "start_date". The data type of this column is an object.

The "start_date" column has some values like:

26/07/2020  9:00:00 AM
25/07/2020  11:06:00 AM
05/09/2020  6:11:00 PM
06/08/2020  1:36:00 AM

I need to extract the month and day for each row. To do this I used this:

df['Day'] = pd.DatetimeIndex(df['start_date']).day
df['Month'] = pd.DatetimeIndex(df['start_date']).month

The output should be something like this:

Day   Month
26     7
25     7
5      9
6      8

But it gave me this:

Day   Month
26     7
25     7
9      5
8      6

The problem is that, in the 3rd row, the day should be 5 and the month is 9. But it read it in the opposite way.

Similarly, in the 4th row, the day should be 6 and the month is 8. But it read it in the opposite way.

Any idea how to fix it?

desertnaut
  • 57,590
  • 26
  • 140
  • 166
MohammedE
  • 99
  • 2
  • 10

1 Answers1

0

Use dayfirst=True

With pandas.DatetimeIndex:

df['Day'] = pd.DatetimeIndex(df['start_date'], dayfirst=True).day
df['Month'] = pd.DatetimeIndex(df['start_date'], dayfirst=True).month

or, with pandas.to_datetime:

df['Day'] = pd.to_datetime(df['start_date'], dayfirst=True).dt.day
df['Month'] = pd.to_datetime(df['start_date'], dayfirst=True).dt.month

output:

                start_date  Day  Month
0   26/07/2020  9:00:00 AM   26      7
1  25/07/2020  11:06:00 AM   25      7
2   05/09/2020  6:11:00 PM    5      9
3   06/08/2020  1:36:00 AM    6      8

documentation:

dayfirstbool, default False

Specify a date parse order if arg is str or is list-like. If True, parses dates with the day first, e.g. "10/11/12" is parsed as 2012-11-10.

Warning

dayfirst=True is not strict, but will prefer to parse with day first. If a delimited date string cannot be parsed in accordance with the given dayfirst option, e.g. to_datetime(['31-12-2021']), then a warning will be shown.

mozway
  • 194,879
  • 13
  • 39
  • 75