6

Is there any way to import an SPSS dataset into Python, preferably in NumPy recarray format? I have looked around but could not find an answer.

Joon

joon

7 Answers

3

Option 1: As rkbarney pointed out, the Python package savReaderWriter is available via PyPI. I've run into two issues:

  1. It relies on a lot of extra libraries beyond the seemingly pure-Python implementation. In nearly every case, SPSS files are read and written by the IBM-provided SPSS I/O modules. These modules differ by platform, and in my experience "pip install savReaderWriter" doesn't get them running out of the box (on OS X).
  2. Development on savReaderWriter, while not dead, is less up to date than one might hope, which compounds the first issue. It relies on some deprecated packages to increase speed and emits warnings every time you import savReaderWriter if they're not available. Not a huge issue today, but it could be trouble in the future as IBM continues to update the SPSS I/O modules to deal with new SPSS formats (they're on version 21 or 22 already, if memory serves).

Option 2: I've chosen to use R as a middleman. Using rpy2, I set up a simple function that reads the file into an R data frame and writes it out again as a CSV file, which I then import into Python. It's a bit Rube Goldberg, but it works. Of course, this requires R, which may also be a hassle to install in your environment (and has different binaries for different platforms).
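A minimal sketch of that round trip, not the exact function described above. It assumes R's foreign package (read.spss) does the reading, and that the CSV is then loaded with Python's csv module and NumPy. The helper names sav_to_csv_via_r and csv_to_recarray are made up for illustration, and the type guessing is deliberately crude: a column becomes float only if every cell parses as a number.

```python
import csv

import numpy as np


def sav_to_csv_via_r(sav_path, csv_path):
    """Let R read the .sav file and write it back out as CSV.

    Assumes R, the R package 'foreign', and rpy2 are installed.
    """
    import rpy2.robjects as ro  # lazy import: the CSV helper below works without R

    ro.r("library(foreign)")
    read_spss = ro.r("function(p) read.spss(p, to.data.frame = TRUE)")
    write_csv = ro.r("function(df, p) write.csv(df, p, row.names = FALSE)")
    write_csv(read_spss(sav_path), csv_path)


def _convert_column(values):
    """Return floats if every cell is numeric (empty or 'NA' becomes NaN), else strings."""
    try:
        return [float("nan") if v in ("", "NA") else float(v) for v in values]
    except ValueError:
        return list(values)


def csv_to_recarray(csv_path):
    """Load a CSV file with a header row into a NumPy recarray."""
    with open(csv_path, newline="", encoding="utf-8") as fh:
        reader = csv.reader(fh)  # handles the quoted strings that write.csv emits
        names = next(reader)
        rows = list(reader)
    columns = [_convert_column(col) for col in zip(*rows)]
    return np.rec.fromarrays(columns, names=names)
```

Usage would look something like sav_to_csv_via_r("survey.sav", "survey.csv") followed by data = csv_to_recarray("survey.csv"), where both file names are placeholders. Going through CSV loses SPSS metadata such as value labels, which may or may not matter for your use.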

mgojohn
  • John, would you be so kind as to post (in a GitHub gist or at pastebin.com) the code in your rpy2-based approach? I'm struggling with this issue http://stackoverflow.com/q/36287936/1389110, and your approach may help. – Pyderman Mar 29 '16 at 15:56
3

SPSS has an extensive integration with Python, but that is meant to be used with SPSS (now known as IBM SPSS Statistics). There is an SPSS ODBC driver that could be used with Python ODBC support to read a sav file.
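As a rough sketch of the ODBC route, assuming the third-party pyodbc package and a working SPSS ODBC driver: the connection string and the query are placeholders, since how the driver exposes a sav file as a table is not covered here and has to come from the driver's own documentation. The helper fetch_table is a hypothetical name and only relies on the standard DB-API cursor interface.

```python
def fetch_table(connection, query):
    """Run a query on any DB-API 2.0 connection and return (column_names, rows)."""
    cursor = connection.cursor()
    try:
        cursor.execute(query)
        # cursor.description holds one 7-item sequence per column; item 0 is the name
        names = [col[0] for col in cursor.description]
        rows = [tuple(row) for row in cursor.fetchall()]
    finally:
        cursor.close()
    return names, rows


def read_over_odbc(connection_string, query):
    """Open an ODBC connection with pyodbc, fetch the query result, and close it.

    Both arguments are assumptions that depend on how the driver is configured.
    """
    import pyodbc  # third-party; imported lazily so fetch_table works without it

    connection = pyodbc.connect(connection_string)
    try:
        return fetch_table(connection, query)
    finally:
        connection.close()
```

If NumPy is available, numpy.rec.fromrecords(rows, names=names) should turn the result into a recarray.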

Jon Peck
  • Or you could just save it in whatever format you like using python from inside SPSS, I assume? Both solutions require that ‘joon’ has access to SPSS though (which is quite expensive AFAIK). – JanC Sep 04 '10 at 18:44
2

gretl claims to import SPSS and export in a variety of formats, as does the R statistical suite. I've never dealt with SPSS data so cannot speak to their relative merits.

msw
2

You could have Python make an external call to spssread, a Perl script that outputs the content of SPSS files in the way you want.
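A sketch of what such an external call might look like, using only the standard library. It assumes the script prints one row per line with tab-separated fields; the exact spssread command-line options are not shown here and should be taken from the script's own help. The function name run_and_parse is made up for illustration.

```python
import csv
import io
import subprocess


def run_and_parse(command, delimiter="\t"):
    """Run an external command and split its stdout into rows of fields.

    `command` is an argument list, e.g. ["perl", "spssread.pl", "survey.sav"],
    where the script path, any options, and the file name are placeholders.
    """
    result = subprocess.run(command, capture_output=True, text=True, check=True)
    # check=True raises CalledProcessError if the command exits with a non-zero status
    return list(csv.reader(io.StringIO(result.stdout), delimiter=delimiter))
```

All fields come back as strings, so numeric columns would still need converting before building a NumPy array.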

Nicolas Raoul
1

Maybe this will help: Python reader + writer for spss sav files (Linux, Mac & Windows) http://code.activestate.com/recipes/577811-python-reader-writer-for-spss-sav-files-linux-mac-/

rkbarney
  • Thanks much! This is the kind of thing I was looking for. It looks very useful. – joon Jan 12 '13 at 02:45
1

To be clear, the SPSS ODBC driver does not require an SPSS installation.

Jon Peck
1

Maybe this will be helpful for someone:

http://sourceforge.net/search/?q=python+SPSS

good luck!

Michal

Michal