2

I have a dataframe float column as:

data = {'mydate': [23131.0,23131.0,np.nan,22677.0,22554.0,np.nan,23131.0]}

df = pd.DataFrame(data,columns=['mydate'])

         mydate 
0        23131.0              
1        23131.0              
2        NaN              
3        22677.0              
4        22554.0              
5        NaN              
6        23131.0              

It contains null values. I am trying to convert it to datetime python using the following code

def dayym(unit):
    dates = {date:((epoch + datetime.timedelta(days=date))) for date in unit.unique()}
    return unit.map(dates)

df.loc[:,'mydate']= dayym(df['mydate'])

with the following error:

    dates = {date:((epoch + datetime.timedelta(days=date))) for date in unit.unique()}
  File "central_read.py", line 18, in <dictcomp>
    dates = {date:((epoch + datetime.timedelta(days=date))) for date in unit.unique()}
ValueError: cannot convert float NaN to integer

Any ideas. I am out of them at this point.

flying_fluid_four
  • 704
  • 1
  • 6
  • 14
  • Please provide the expected [MRE - Minimal, Reproducible Example](https://stackoverflow.com/help/minimal-reproducible-example). Show where the intermediate results deviate from the ones you expect. We should be able to paste a single block of your code into file, run it, and reproduce your problem. This also lets us test any suggestions in your context. [Include your minimal data frame](https://stackoverflow.com/questions/52413246/how-to-provide-a-reproducible-copy-of-your-dataframe-with-to-clipboard) as part of the example. – Prune Apr 02 '21 at 01:59
  • data = {'mydate': [23131.0,23131.0,np.nan,22677.0,22554.0,np.nan,23131.0]} df = pd.DataFrame(data,columns=['mydate']) – flying_fluid_four Apr 02 '21 at 02:04
  • 1
    what does 23131 represent? Is it a relative number to another date? – Joe Ferndz Apr 02 '21 at 04:43
  • see [Convert Excel style date with pandas](https://stackoverflow.com/q/38454403/10197418) if it's excel time format – FObersteiner Apr 02 '21 at 09:46

3 Answers3

3

To convert a float to datetime and ignore np.nan values, you can use pd.to_datetime with errors='coerce'

import pandas as pd
import numpy as np
data = {'mydate': [23131.0,23131.0,np.nan,22677.0,22554.0,np.nan,23131.0]}
df = pd.DataFrame(data)
df['mydate'] = pd.to_datetime(df['mydate'], errors='coerce')
print (df)

This will give you:

                         mydate
0 1970-01-01 00:00:00.000023131
1 1970-01-01 00:00:00.000023131
2                           NaT
3 1970-01-01 00:00:00.000022677
4 1970-01-01 00:00:00.000022554
5                           NaT
6 1970-01-01 00:00:00.000023131
Joe Ferndz
  • 8,417
  • 2
  • 13
  • 33
1

Not sure what is epoch, so used 1900-01-01 in parameter origin, also is necessary add unit='d' fo days and errors='coerce' for convert missing or wrong values to NaT in to_datetime:

df['mydate'] = pd.to_datetime(df['mydate'], errors='coerce', unit='d', origin='1900-01-01')
print (df)    
      mydate
0 1963-05-02
1 1963-05-02
2        NaT
3 1962-02-02
4 1961-10-02
5        NaT
6 1963-05-02

If need epoch is 1970-01-01 is possible use:

df['mydate'] = pd.to_datetime(df['mydate'], errors='coerce', unit='d', origin='unix')

#default value, so should be removed
df['mydate'] = pd.to_datetime(df['mydate'], errors='coerce', unit='d')
print (df)    
      mydate
0 2033-05-01
1 2033-05-01
2        NaT
3 2032-02-02
4 2031-10-02
5        NaT
6 2033-05-01
jezrael
  • 822,522
  • 95
  • 1,334
  • 1,252
0

Use .dropna() to drop them

df['mydate'] = df['mydate'].dropna().apply(daymm)
Wasif
  • 14,755
  • 3
  • 14
  • 34
  • dates = {date:((epoch + datetime.timedelta(days=date))) for date in unit.unique()} AttributeError: 'float' object has no attribute 'unique' – flying_fluid_four Apr 02 '21 at 02:10