I have a column of my dataframe that is made up of the following:

df['Year'] = [2025, 2024, NaN, 2023, 2026, NaN] (these are of type float64)

How can I convert these years to something in datetime format? Since there are no months or days included, I assume the output would have to default to [01-01-2025, 01-01-2024, NaT, 01-01-2023, 01-01-2026, NaT].

But if there was a way to still have the column as [2025, 2024, NaT, 2023, 2026, NaT] then that would work well too.

Using df['Year'] = pd.DatetimeIndex(df['Year']).year just output [1970, 1970, NaN, 1970, 1970, NaN].

Thank you very much.

user4740374
  • Does this answer your question? [How to convert string to datetime format in pandas python?](https://stackoverflow.com/questions/32204631/how-to-convert-string-to-datetime-format-in-pandas-python) – Jonas Palačionis Jul 21 '22 at 16:53
  • The problem is that my column of float does not include anything for DD or MM. It is just a column of years. – user4740374 Jul 21 '22 at 16:55

2 Answers

2

You can use pandas' to_datetime() and set errors='coerce' to take care of the NaNs (-> NaT)

df['Year'] = pd.to_datetime(df['Year'], format='%Y', errors='coerce')

The output will be January 1 of each year, e.g. 2025-01-01, 2024-01-01, ..., with NaT where the year is missing.
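One caveat worth checking on your own data: the column in the question is float64, and a float such as 2025.0 may be turned into the string "2025.0" before parsing, which format='%Y' does not match. With errors='coerce' that mismatch would silently become NaT. The sketch below is one way around it, not the only one: it casts the non-missing years to int before parsing and then restores the missing rows. The column names Year_dt and Year_int are illustrative assumptions, not part of the question.

```python
import numpy as np
import pandas as pd

# Sample data mirroring the question: float64 years with missing values.
df = pd.DataFrame({"Year": [2025, 2024, np.nan, 2023, 2026, np.nan]})

# Parse only the non-missing years. Casting to int first turns 2025.0 into
# "2025", which is what format="%Y" expects.
valid_years = df["Year"].dropna().astype(int).astype(str)
parsed = pd.to_datetime(valid_years, format="%Y")

# Reindexing restores the dropped rows; they come back as NaT.
df["Year_dt"] = parsed.reindex(df.index)

# If plain years are enough, a nullable integer column keeps the missing
# values (shown as <NA>) without forcing the years to float.
df["Year_int"] = df["Year"].astype("Int64")

print(df)
```

This also explains the 1970 result in the question: pd.DatetimeIndex treats bare numbers as nanoseconds since 1970-01-01, so 2025 becomes a timestamp a few microseconds after the epoch, whose year is 1970.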

Ignatius Reilly
0

Probably not the most elegant solution, but if you convert the column to string and fill the missing values with a dummy year (say 1900), you can use parser from dateutil:

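from dateutil import parser

# Year is float64: cast via nullable Int64 so the strings read '2025', not '2025.0'
years = df['Year'].astype('Int64').astype('string')
('01/01/' + years).fillna('1900').apply(parser.parse)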

Out[67]:

0   2025-01-01
1   2024-01-01
2   1900-07-21
3   2023-01-01
4   2026-01-01
5   1900-07-21
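The dummy rows show 1900-07-21 because dateutil fills any missing month and day from the current date by default. If the missing years should stay missing instead, a small variation is to skip the dummy value and return NaT directly. This is only a sketch: parse_year is a hypothetical helper name, and the sample frame is rebuilt from the values in the question.

```python
import numpy as np
import pandas as pd
from dateutil import parser

# Sample data mirroring the question: float64 years with missing values.
df = pd.DataFrame({"Year": [2025, 2024, np.nan, 2023, 2026, np.nan]})


def parse_year(value):
    """Return January 1 of the given year, or NaT when the year is missing."""
    if pd.isna(value):
        return pd.NaT
    # int(value) drops the trailing ".0" of the float before building the string.
    return parser.parse(f"01/01/{int(value)}")


# pd.to_datetime makes sure the result has a datetime64 dtype.
result = pd.to_datetime(df["Year"].apply(parse_year))
print(result)
```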
blackraven
  • It seems to work, but ideally we want the NaN to stay as NaN or NaT. – user4740374 Jul 21 '22 at 17:06
  • Hi, could you please format your code better? [This article may be helpful to you.](https://stackoverflow.com/help/formatting) It will improve the readability of your answer! – SAL Jul 22 '22 at 17:16