I have a column ("ADMIT_TERM") of integers in a dataframe. A typical element in the column looks like 200110, where 2001 is the year and 10 is the month. I need to convert this column to type datetime.

I succeeded in doing so using the clunky method below. Could someone offer a more efficient way of writing this code?

Freshman['ADMIT_YEAR'] = Freshman['ADMIT_TERM'].astype(str).str.slice(0,4)
Freshman['ADMIT_MONTH'] = Freshman['ADMIT_TERM'].astype(str).str.slice(4,6)
Freshman['ADMIT_DATE_str'] = Freshman['ADMIT_YEAR']+'/'+Freshman['ADMIT_MONTH']
Freshman['ADMIT_DATE'] = pd.to_datetime(Freshman['ADMIT_DATE_str'], format="%Y/%m")

Note: I believe this question is not answered here since my dates are not integer days.

James Eaves

1 Answer

Just apply pd.to_datetime directly to the column after converting it to strings; there is no need for string slicing here:

Freshman['ADMIT_DATE'] = pd.to_datetime(Freshman['ADMIT_TERM'].astype(str), format='%Y%m')

There is no requirement for there to be a delimiter between the digits:

>>> import pandas as pd
>>> df = pd.DataFrame({'ADMIT_DATE': [200110, 201604]})
>>> df['ADMIT_DATE'] = pd.to_datetime(df['ADMIT_DATE'].astype(str), format='%Y%m')
>>> df
  ADMIT_DATE
0 2001-10-01
1 2016-04-01
>>> df.dtypes
ADMIT_DATE    datetime64[ns]
dtype: object
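
If the column might contain values that are not valid year-month integers, the same call can be wrapped in a small helper. This is only a sketch under assumptions: the helper name yyyymm_to_datetime and the sample values are made up for illustration, and errors="coerce" is used so that values that cannot be parsed become NaT instead of raising an exception.

```python
import pandas as pd


def yyyymm_to_datetime(col: pd.Series) -> pd.Series:
    """Convert integers such as 200110 to timestamps on the first of the month.

    Hypothetical helper, not part of pandas.
    """
    # Cast to str so that "%Y%m" can be matched against the digits;
    # errors="coerce" maps values that fail to parse to NaT.
    return pd.to_datetime(col.astype(str), format="%Y%m", errors="coerce")


# Illustrative data: the last value has month 13, which is not a valid month.
terms = pd.Series([200110, 201604, 201613])
dates = yyyymm_to_datetime(terms)
print(dates)
```

Checking dates.isna() afterwards shows which rows failed to parse, which may be preferable to letting a single bad value stop the whole conversion.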
Martijn Pieters