
I have not worked with SPSS (.sav) files before and am trying to work with some data files provided to me by importing them into R. I did not receive any explanation of the files, and because communication is difficult I am trying to figure out as much as I can on my own.

Here's my first question. This is what the Date field looks like in an R data frame after import:

> dataset2$Date[1:4]
[1] 13608172800 13608259200 13608345600 13608345600

I don't know what dates the data is supposed to be for, but I found that if I divide the above numbers by 10, that seems to give a reasonable date (in February 2013). Can anyone confirm this is indeed what the above represents?

My second question is regarding another column called Begin_time. Here's what that looks like:

> dataset2$Begin_time[1:4]
[1] 29520 61800 21480 55080

Any idea what this is representing? I want to believe this is some representation of time of day because the records are for wildlife observations, but I haven't got more info than that to try to guess. I noticed that if I take the difference between End_Time and Begin_time I get numbers like 120 and 180, which seems like minutes to me (3 hours seems reasonable to observe a wild animal), but the absolute numbers are far greater than the number of minutes in a day (1440), so that leaves me puzzled. Is this some time keeping format from SPSS? If so, what's the logic?

Unfortunately, I don't have access to SPSS, so any help would be much appreciated.

helloB
  • Not an SPSS user myself, but maybe `spss.get` from package `Hmisc` helps? – erc Jun 17 '16 at 12:16
  • @beetroot I did not know about that package. Thanks! Yes, it would definitely be useful to see if I get more intuitive output loading with another package. The load above did give me a few warnings, although I couldn't decipher which warnings went with which columns. – helloB Jun 17 '16 at 12:22
  • You may check the package [haven](https://github.com/hadley/haven). From [the first release notes](https://blog.rstudio.org/2015/03/04/haven-0-1-0/): "Dates are converted in to `Date`s, and datetimes to `POSIXct`s.". See also the ["Dates and times" vignette of the package](https://cran.r-project.org/web/packages/haven/vignettes/datetimes.html), which describes the formats of SPSS; "Dates and date times use a difference offset to R" – Henrik Jun 17 '16 at 13:02

2 Answers


I had the same problem and this function is a good solution:

spss2date <- function(x) as.Date(x/86400, origin = "1582-10-14")

This is where I found the answer:

http://scs.math.yorku.ca/index.php/R:_Importing_dates_from_SPSS
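As a quick check, here is a minimal sketch that applies the same arithmetic to the four values posted in the question. The helper name `spss_to_date` is just an illustrative name, not something from a package. If the arithmetic holds, the values land in early January 2014, so the February 2013 reading obtained by dividing by 10 looks like a coincidence.

```r
# Hypothetical helper name; same arithmetic as the one-liner in this answer.
# SPSS stores dates as seconds since 1582-10-14, so divide by the number of
# seconds in a day and count days from that origin.
spss_to_date <- function(x) as.Date(x / 86400, origin = "1582-10-14")

# Values copied from dataset2$Date[1:4] in the question
raw <- c(13608172800, 13608259200, 13608345600, 13608345600)

dates <- spss_to_date(raw)
print(dates)
# [1] "2014-01-04" "2014-01-05" "2014-01-06" "2014-01-06"
```

Three consecutive days with one repeat seems plausible for field observations, but only the data provider can confirm the intended dates.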

Adam Richardson
Diaa Al mohamad
  • Explanation: x/86400 gives the number of days (there are 60*60*24 = 86400 seconds in a day) since October 14th 1582, the SPSS epoch (the Gregorian calendar took effect the following day, October 15th 1582) – Marjolein Fokkema Feb 22 '21 at 23:45
  • Note that all dates in SPSS Statistics are actually datetimes, so there may be a part representing the time. For date-only values, the time part would be zero. – JKP Jul 03 '21 at 13:10

Dates in SPSS Statistics are represented as floating point doubles holding the number of seconds since Oct 14, 1582. If you use the SPSS R plugin APIs, they can be automatically converted to R dates, but any proper converter should be able to do this for you.
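If no converter is at hand, the conversion can also be done manually in base R. The sketch below keeps the time part by converting to `POSIXct`, and separately formats the `Begin_time` values from the question on the assumption that they are seconds since midnight. That assumption is not confirmed by the data provider, and the helper names `spss_to_posixct` and `secs_to_hms` are made up for illustration.

```r
# Hypothetical helpers, base R only.
# Seconds since 1582-10-14 -> POSIXct (UTC chosen here to avoid local-time shifts).
spss_to_posixct <- function(x) as.POSIXct(x, origin = "1582-10-14", tz = "UTC")

# Seconds since midnight -> "HH:MM:SS" string.
secs_to_hms <- function(s) {
  sprintf("%02d:%02d:%02d",
          as.integer(s %/% 3600),
          as.integer((s %% 3600) %/% 60),
          as.integer(s %% 60))
}

# Values copied from the question
raw_date   <- c(13608172800, 13608259200, 13608345600, 13608345600)
begin_time <- c(29520, 61800, 21480, 55080)

dt <- spss_to_posixct(raw_date)

# Time-of-day part of each date value; zero means a date-only value.
time_part <- raw_date %% 86400
print(time_part)
# [1] 0 0 0 0

# ASSUMPTION: Begin_time holds seconds since midnight.
print(secs_to_hms(begin_time))
# [1] "08:12:00" "17:10:00" "05:58:00" "15:18:00"
```

Under that reading, the differences of 120 and 180 between End_Time and Begin_time would be seconds (2 to 3 minutes), not minutes.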

JKP