0

I'm trying to convert SPSS timestamps to human readable timestamps such as

 data['Completion_date'] = pd.to_datetime(
     data['Completion_date']/86400, unit='D',
     origin=pd.Timestamp("1582-10-14"))

but get

Traceback (most recent call last):
  File "<input>", line 1, in <module>
  File "pandas/_libs/tslibs/timestamps.pyx", line 644, in pandas._libs.tslibs.timestamps.Timestamp.__new__
  File "pandas/_libs/tslibs/conversion.pyx", line 275, in pandas._libs.tslibs.conversion.convert_to_tsobject
  File "pandas/_libs/tslibs/conversion.pyx", line 470, in pandas._libs.tslibs.conversion.convert_str_to_tsobject
  File "pandas/_libs/tslibs/conversion.pyx", line 439, in pandas._libs.tslibs.conversion.convert_str_to_tsobject
  File "pandas/_libs/tslibs/np_datetime.pyx", line 121, in pandas._libs.tslibs.np_datetime.check_dts_bounds
pandas._libs.tslibs.np_datetime.OutOfBoundsDatetime: Out of bounds nanosecond timestamp: 1582-10-14 00:00:00

Example: 13725072000 should convert to 2017-09-18

Dates in SPSS are recorded in seconds since October 14, 1582, the date of the beginning of the Julian calendar.

How else would I do it?

Prune
  • 76,765
  • 14
  • 60
  • 81
km6
  • 2,191
  • 2
  • 15
  • 18

2 Answers2

2

From the docs:

Since pandas represents timestamps in nanosecond resolution, the time span that can be represented using a 64-bit integer is limited to approximately 584 years:

In [68]: pd.Timestamp.min 
Out[68]: Timestamp('1677-09-21 00:12:43.145225')

In [69]: pd.Timestamp.max 
Out[69]: Timestamp('2262-04-11 23:47:16.854775807')

Your date is earlier than the earliest possible timestamp.

One way you might want to solve this is by adjusting your dates to instead be based on the to_datetime default unix time, which begins 141,428 days after the beginning of the Julian calendar (thanks, Google!). Then you can do:

data['Completion_date'] = pd.to_datetime((data['Completion_date']/86400)-141428, unit='D')
asongtoruin
  • 9,794
  • 3
  • 36
  • 47
0
/* SPSS SYNTAX CODE WITH EXAMPLES TO UNDERSTAND AND USE.  */.

DATA LIST LIST (",")/ MIS_FECHAS(DATETIME).

BEGIN DATA
15/OCT/1582 0:00:00 ,        /* ES RARO, PERO FECHAS ANTERIORES A ESTA NO SON VALIDAS. LA CIFRA MINIMA DE FECHA EN SPSS ES 86400 SECONDS.  */.

01/JAN/1970 0:00:00 ,

01/JAN/2017 0:00:00 ,

03/NOV/2021 1:35:00 ,

END DATA.



COMPUTE NUMERIC_SPSS =MIS_FECHAS*1.
EXECUTE. 


COMPUTE NUMERIC_UNIX_SEC = NUMERIC_SPSS  - 12219379200.
EXECUTE. 

COMPUTE NUMERIC_UNIX_MILISEC = (NUMERIC_SPSS  - 12219379200) * 1000.
EXECUTE. 


COMPUTE NUMERIC_SPSS_FROM_UNIX_SECS = NUMERIC_UNIX_SEC  + 12219379200.
EXECUTE. 


COMPUTE NUMERIC_SPSS_FROM_UNIX_MILISECS = (NUMERIC_UNIX_MILISEC/1000)  + 12219379200.
EXECUTE. 

/**********/.

If you want to use an spss date in python, you have to convert it first to UNIX format.

I think this change in your code in Python could work properly (not tested):

 data['Completion_date'] = pd.to_datetime(
     (data['Completion_date']-12219379200)/86400, unit='D',
     origin=pd.Timestamp("1582-10-14"))

I'm not an expert in python, but the idea is just subtract 12219379200 from spss numeric date, which is the number of seconds from 1582-10-14 (SPSS "origin") to 1970-01-01 (PYTHON origin). The result is a number in UNIX and python standard format.

marc_s
  • 732,580
  • 175
  • 1,330
  • 1,459
  • 1
    As it’s currently written, your answer is unclear. Please [edit] to add additional details that will help others understand how this addresses the question asked. You can find more information on how to write good answers [in the help center](/help/how-to-answer). – Community Nov 04 '21 at 05:56