Using Python 3 and pandas, I have a DataFrame column of dates stored as the number of milliseconds since Jan 1, 1970 00:00:00 UTC:

presidentes["Data distribuição"].head()
0    1367193600000.0
1    1252886400000.0
2    1063929600000.0
3    1196294400000.0
4    1254873600000.0
Name: Data distribuição, dtype: object

I want to convert the column to a timestamp dtype and tried this:

presidentes['Data distribuição'] = pd.to_datetime(presidentes['Data distribuição'], unit='s')

---------------------------------------------------------------------------
OverflowError                             Traceback (most recent call last)
pandas/_libs/tslib.pyx in pandas._libs.tslib.array_with_unit_to_datetime()

pandas/_libs/tslibs/timedeltas.pyx in pandas._libs.tslibs.timedeltas.cast_from_unit()

OverflowError: Python int too large to convert to C long

During handling of the above exception, another exception occurred:

OutOfBoundsDatetime                       Traceback (most recent call last)
<ipython-input-10-53c23a37f5b2> in <module>
----> 1 presidentes['Data distribuição'] = pd.to_datetime(presidentes['Data distribuição'], unit='s')

~/Documentos/Code/publique_se/lib/python3.6/site-packages/pandas/core/tools/datetimes.py in to_datetime(arg, errors, dayfirst, yearfirst, utc, box, format, exact, unit, infer_datetime_format, origin, cache)
    590         else:
    591             from pandas import Series
--> 592             values = convert_listlike(arg._values, True, format)
    593             result = Series(values, index=arg.index, name=arg.name)
    594     elif isinstance(arg, (ABCDataFrame, compat.MutableMapping)):

~/Documentos/Code/publique_se/lib/python3.6/site-packages/pandas/core/tools/datetimes.py in _convert_listlike_datetimes(arg, box, format, name, tz, unit, errors, infer_datetime_format, dayfirst, yearfirst, exact)
    201         arg = getattr(arg, 'values', arg)
    202         result = tslib.array_with_unit_to_datetime(arg, unit,
--> 203                                                    errors=errors)
    204         if box:
    205             if errors == 'ignore':

pandas/_libs/tslib.pyx in pandas._libs.tslib.array_with_unit_to_datetime()

OutOfBoundsDatetime: cannot convert input 1367193600000.0 with the unit 's'

Is there another way to perform this conversion?

Reinaldo Chaves
  • `pd.to_datetime(df[1].astype('int64') * 1e6)` or `df[1].astype('datetime64[ms]')`? – Quang Hoang Sep 09 '19 at 19:27
  • Thank you @Quang Hoang. The first method, `presidentes['Data distribuição'] = pd.to_datetime(presidentes['Data distribuição'].astype('int64') * 1e6)`, raises `ValueError: invalid literal for int() with base 10: '1367193600000.0 '` – Reinaldo Chaves Sep 09 '19 at 19:34
  • The second method, `presidentes['Data distribuição'] = presidentes['Data distribuição'].astype('datetime64[ms]')`, raises `OverflowError: signed integer is greater than maximum` – Reinaldo Chaves Sep 09 '19 at 19:34
  • Your data seem to be strings with trailing spaces. How about `pd.to_numeric(df[1]).astype('datetime64[ms]')`? – Quang Hoang Sep 09 '19 at 19:39
  • Thanks a lot, it seems like it worked now – Reinaldo Chaves Sep 09 '19 at 20:00
  • Possible duplicate of [How do I create a datetime in Python from milliseconds?](https://stackoverflow.com/questions/748491/how-do-i-create-a-datetime-in-python-from-milliseconds) – MattR Sep 09 '19 at 21:24

1 Answer

The values are in milliseconds, so passing unit='ms' instead of unit='s' seems to work:

presidentes['Data distribuição'] = pd.to_datetime(presidentes['Data distribuição'], unit='ms')
presidentes['Data distribuição'].head()

0   2013-04-29
1   2009-09-14
2   2003-09-19
3   2007-11-29
4   2009-10-07
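
If the column holds strings rather than numbers (the question shows dtype: object, and the comments mention trailing spaces), it may help to parse the values first. The following is a minimal sketch under that assumption, not a definitive fix: the three-row DataFrame is made-up sample data copied from the question's output, and the names `millis` and `converted` are purely illustrative. It also hints at why unit='s' failed: read as seconds, 1367193600000 lands tens of thousands of years in the future, beyond the range pandas' nanosecond timestamps can represent (roughly up to the year 2262).

```python
import pandas as pd

# Hypothetical sample mirroring the question's column:
# strings with a trailing space, as reported in the comments.
presidentes = pd.DataFrame(
    {"Data distribuição": ["1367193600000.0 ", "1252886400000.0 ", "1063929600000.0 "]}
)

# Strip whitespace and parse the strings as numbers, then cast to int64.
# Assumption: the column has no missing values (NaN cannot be cast to int64).
millis = pd.to_numeric(presidentes["Data distribuição"].str.strip()).astype("int64")

# Interpret the integers as milliseconds since the Unix epoch.
converted = pd.to_datetime(millis, unit="ms")
presidentes["Data distribuição"] = converted

print(presidentes["Data distribuição"])
```

If the real column contains missing or malformed entries, `pd.to_numeric(..., errors="coerce")` without the `int64` cast is one alternative to consider; unparseable values then become NaN and end up as NaT after conversion.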
wkzhu