113

I have a column I_DATE of type string (object) in a dataframe called train, as shown below.

I_DATE
28-03-2012 2:15:00 PM
28-03-2012 2:17:28 PM
28-03-2012 2:50:50 PM

How do I convert I_DATE from string to datetime format and specify the format of the input string?

Also, how do I filter rows based on a range of dates in pandas?

cottontail
GeorgeOfTheRF
The tl;dr: [pandas.to_datetime](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.to_datetime.html). Be aware, though, that not all formats are parsed correctly; you might want to look at the `dayfirst` keyword or set a `format`. – FObersteiner Jun 30 '22 at 06:47

3 Answers

181

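Use to_datetime. For this sample there is no need for a format string, since the parser can infer the layout on its own (for ambiguous values such as 03-04-2012 you would pass dayfirst=True or an explicit format):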

In [51]:
pd.to_datetime(df['I_DATE'])

Out[51]:
0   2012-03-28 14:15:00
1   2012-03-28 14:17:28
2   2012-03-28 14:50:50
Name: I_DATE, dtype: datetime64[ns]
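
To illustrate why the day/month order can matter, here is a minimal sketch. The Series `ambiguous` is made up for this example and is not from the question; it holds values where both leading fields are 12 or less, so either could be the month.

```python
import pandas as pd

# Hypothetical sample: day and month are both <= 12, so the order is ambiguous.
ambiguous = pd.Series(["03-04-2012 2:15:00 PM", "05-04-2012 2:17:28 PM"])

# By default the first field is read as the month.
month_first = pd.to_datetime(ambiguous)

# dayfirst=True reads the first field as the day, matching the
# DD-MM-YYYY layout used in the question.
day_first = pd.to_datetime(ambiguous, dayfirst=True)

print(month_first.dt.month.tolist())
print(day_first.dt.month.tolist())
```

Depending on the pandas version, the first call may emit a warning about inferring the format; passing an explicit format avoids both the warning and the ambiguity.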

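Once the converted result has been assigned back to the column (df['I_DATE'] = pd.to_datetime(df['I_DATE'])), use the dt accessor to get the date or time component: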

In [54]:
df['I_DATE'].dt.date

Out[54]:
0    2012-03-28
1    2012-03-28
2    2012-03-28
dtype: object

In [56]:    
df['I_DATE'].dt.time

Out[56]:
0    14:15:00
1    14:17:28
2    14:50:50
dtype: object
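
The dt accessor exposes more than date and time. A short sketch, rebuilding the question's sample values in a frame named `train` (the added column names `year`, `hour`, `weekday` and `day` are only illustrative):

```python
import pandas as pd

# Sample values copied from the question.
train = pd.DataFrame({"I_DATE": ["28-03-2012 2:15:00 PM",
                                 "28-03-2012 2:17:28 PM",
                                 "28-03-2012 2:50:50 PM"]})
train["I_DATE"] = pd.to_datetime(train["I_DATE"], format="%d-%m-%Y %I:%M:%S %p")

# Numeric components come back as integer Series.
train["year"] = train["I_DATE"].dt.year
train["hour"] = train["I_DATE"].dt.hour

# day_name() returns the weekday as text.
train["weekday"] = train["I_DATE"].dt.day_name()

# normalize() resets the time to midnight but keeps the datetime64 dtype,
# unlike .dt.date, which yields Python date objects (dtype object).
train["day"] = train["I_DATE"].dt.normalize()

print(train.dtypes)
```

Keeping the datetime64 dtype via normalize() is usually more convenient if you still want to filter or resample by date afterwards.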

You can filter a datetime column by comparing it against date strings, for example:

In [59]:
import datetime as dt
import pandas as pd
df = pd.DataFrame({'date': pd.date_range(start=dt.datetime(2015, 1, 1), end=dt.datetime.now())})
df[(df['date'] > '2015-02-04') & (df['date'] < '2015-02-10')]

Out[59]:
         date
35 2015-02-05
36 2015-02-06
37 2015-02-07
38 2015-02-08
39 2015-02-09
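
Another option, sketched here under the assumption that you are happy to make the dates the index, is label slicing on a DatetimeIndex. The `value` column is a hypothetical payload added only so the result has something to show.

```python
import pandas as pd

# Same kind of frame as above, but with a fixed end date so the result is reproducible.
df = pd.DataFrame({"date": pd.date_range("2015-01-01", "2015-04-01")})
df["value"] = range(len(df))  # hypothetical payload column

# With the dates as a sorted DatetimeIndex, .loc accepts date strings as slice bounds.
indexed = df.set_index("date")
window = indexed.loc["2015-02-05":"2015-02-09"]  # both endpoints are included

print(window)
```

Note that, unlike the strict comparisons above, this slice includes both endpoints.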
wjandrea
EdChum
21

Approach: 1

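Given the original string format 2019/03/04 00:08:48, you can cast the column with astype:

updated_col = df['timestamp'].astype('datetime64[ns]')

This returns a Series whose values are in the datetime format 2019-03-04 00:08:48.

Approach: 2

Pass a column-to-dtype mapping to DataFrame.astype, which returns a new DataFrame with only that column converted:

updated_df = df.astype({'timestamp':'datetime64[ns]'})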
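
One caveat worth sketching: astype has no option for skipping malformed values and, as far as I know, fails on them. If the column may contain bad entries, to_datetime with errors="coerce" is an alternative. The frame below is hypothetical and includes one deliberately invalid string.

```python
import pandas as pd

# Hypothetical column with one value that is not a date.
df = pd.DataFrame({"timestamp": ["2019/03/04 00:08:48", "not a date"]})

# errors="coerce" turns unparseable entries into NaT instead of raising.
df["timestamp"] = pd.to_datetime(
    df["timestamp"], format="%Y/%m/%d %H:%M:%S", errors="coerce"
)

print(df["timestamp"].isna().tolist())
```

The NaT rows can then be inspected or dropped with dropna(subset=["timestamp"]).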
Arjjun
5

For a datetime in AM/PM format, the time format is '%I:%M:%S %p'. See all possible format combinations at https://strftime.org/. N.B. If the strings have a time component, as in the OP, the conversion is much faster when you pass format= explicitly.

df['I_DATE'] = pd.to_datetime(df['I_DATE'], format='%d-%m-%Y %I:%M:%S %p')
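
To see what each directive in that format string does, it can help to try it on a single value with the standard library's datetime.strptime, which uses the same directive names. This is only an illustration on one of the question's sample strings; the variable names are made up.

```python
from datetime import datetime

fmt = "%d-%m-%Y %I:%M:%S %p"
# %d day, %m month, %Y four-digit year,
# %I hour on the 12-hour clock, %M minute, %S second, %p AM/PM marker
parsed = datetime.strptime("28-03-2012 2:15:00 PM", fmt)
print(parsed)

# A common slip: with %H (24-hour clock) the %p marker is parsed but has
# no effect on the hour, so 2 PM silently stays at hour 2.
wrong = datetime.strptime("28-03-2012 2:15:00 PM", "%d-%m-%Y %H:%M:%S %p")
print(wrong)
```

This is why %I, not %H, is the right directive whenever the strings carry AM/PM.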

To filter a datetime using a range, you can use query:

df = pd.DataFrame({'date': pd.date_range('2015-01-01', '2015-04-01')})
df.query("'2015-02-04' < date < '2015-02-10'")

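or use between to create a mask and filter. Note that between includes both endpoints by default, whereas the query above uses strict inequalities.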

df[df['date'].between('2015-02-04', '2015-02-10')]
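
The difference between the two filters can be checked with a small sketch. It assumes pandas 1.3 or later for the string values of `inclusive`; the names `start`, `end`, `strict`, `closed` and `open_` are illustrative.

```python
import pandas as pd

df = pd.DataFrame({"date": pd.date_range("2015-01-01", "2015-04-01")})

start, end = pd.Timestamp("2015-02-04"), pd.Timestamp("2015-02-10")

# query can reference local Python variables with the @ prefix.
strict = df.query("@start < date < @end")

# between includes both endpoints by default ...
closed = df[df["date"].between(start, end)]

# ... and inclusive="neither" (assumes pandas >= 1.3) reproduces the strict comparison.
open_ = df[df["date"].between(start, end, inclusive="neither")]

print(len(strict), len(closed), len(open_))
```

Using variables instead of hard-coded strings also makes it easier to reuse the same range in several filters.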
cottontail