0

You can use Python to automate tasks in SPSS or to shorten the work, but I need to know whether Python can replace SPSS syntax altogether, for example to aggregate data in loops.

Another example: I have two datasets with the following variables: id, begin, end and type. Is it possible to put them into different arrays/lists and then compare those arrays/lists, so that at the end I have one new table/dataset with the non-matching entries and another with the matching entries in SPSS? My idea is to extend what file matching in SPSS can do.

Normally, programming languages like Python or PHP can handle this.

Excuse me. I hope someone will understand what I mean.

Brian Tompsett - 汤莱恩
  • 1
    Which SPSS specifically do you use? Do you have a link? I'm also wondering why you say "you can automate processes in the SPSS with Python" and then "I want to replace the SPSS syntax with Python". How can both of these statements be true? – Aaron Digulla Mar 10 '14 at 12:46
  • I use SPSS Statistics version 22. Python is part of the installation. A simple example of automation can be found [here](http://www.spss-tutorials.com/suffix-all-variable-names/). – user2491338 Mar 10 '14 at 13:10
  • Replacing the SPSS syntax means using Python code for the matching instead of "MATCH FILES", as in the example with the two datasets. – user2491338 Mar 10 '14 at 14:11

2 Answers

0

This question explains several ways to import an SPSS dataset in Python code: Importing SPSS dataset into Python

Afterwards, you can use the standard Python tools to analyze them.
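As a minimal sketch of what "standard Python tools" can look like for the two-dataset case in the question: assume the rows have already been loaded into lists of tuples `(id, begin, end, type)`. The loading step is not shown, and the function name and sample rows are hypothetical.

```python
def split_matches(left, right):
    """Split the rows of `left` into those that also occur in `right`
    and those that do not. Rows are compared as whole tuples."""
    right_keys = set(right)  # set membership tests are fast
    matched = [row for row in left if row in right_keys]
    unmatched = [row for row in left if row not in right_keys]
    return matched, unmatched


# Hypothetical sample rows: (id, begin, end, type)
dataset1 = [
    (1, "2013-01-01", "2013-06-30", "A"),
    (2, "2013-02-01", "2013-03-31", "B"),
    (3, "2013-05-01", "2013-12-31", "A"),
]
dataset2 = [
    (1, "2013-01-01", "2013-06-30", "A"),
    (3, "2013-05-01", "2013-11-30", "A"),  # end date differs from dataset1
]

matched, unmatched = split_matches(dataset1, dataset2)
```

Here a row counts as a match only if all four values agree. To match on a subset of the variables, build the set from a key such as `(row[0], row[1])` instead of the whole tuple.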

Note: I've had some success with simply formatting the data in a text file. I can then use any diff tool to compare the files.

The advantage of this approach is that it's usually very easy to write text exporters which sort the data to make it easier for the diff tool to see what is similar.
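A rough sketch of that idea using only the standard library: sort the rows, render each one as a tab-separated line, and let `difflib` do the comparison. The function names, labels and sample rows are hypothetical, and an external diff tool would work just as well on the exported text.

```python
import difflib


def to_sorted_lines(rows):
    """Render each row as one tab-separated line, in sorted row order."""
    return ["\t".join(str(value) for value in row) for row in sorted(rows)]


def diff_rows(old_rows, new_rows):
    """Return unified-diff lines between the two sorted text exports."""
    return list(
        difflib.unified_diff(
            to_sorted_lines(old_rows),
            to_sorted_lines(new_rows),
            fromfile="dataset1",
            tofile="dataset2",
            lineterm="",
        )
    )


# Hypothetical sample rows, deliberately unsorted in the first dataset
old_rows = [(2, "b"), (1, "a")]
new_rows = [(1, "a"), (3, "c")]

diff = diff_rows(old_rows, new_rows)
print("\n".join(diff))
```

Lines starting with `-` exist only in the first dataset, lines starting with `+` only in the second, and lines starting with a space are common to both.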

The drawback is that text only works for simple cases. When your data has a recursive structure, then text is not ideal. In that case, try an XML diff tool.

Aaron Digulla
  • Thank you for your answer. I don't only want to compare the files; I want to match the entries by id and begin/end date. For example, the courses in dataset 2 should be matched with the correct person in dataset 1 and his/her corresponding period of membership. This must be done row by row, and if there are members with no courses or courses with no corresponding member, I have to take a closer look. – user2491338 Mar 11 '14 at 08:23
  • I understand; just keep in mind that the answers here are not only for you. – Aaron Digulla Mar 11 '14 at 08:31
0

There are many ways to do this sort of thing with Python. The SPSS module Dataset class allows you to read and write the case data. The spssdata module provides a somewhat simpler way to do this. These are included when you install the Python Essentials. There are also utility modules available from the SPSS Community website. In particular, the extended Transforms module provides a standard lookup function and an interval-based lookup.
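To illustrate what an interval-based lookup does, here is a plain-Python sketch. It does not use the API of the modules mentioned above; it assumes the rows have already been read into tuples, and the function name, tuple layout and sample data are all hypothetical.

```python
from collections import defaultdict
from datetime import date


def match_courses(memberships, courses):
    """Match each course to the membership with the same id whose
    begin/end period contains the course date.

    memberships: tuples (id, begin, end); courses: tuples (id, date, name).
    Returns (matched pairs, courses without a member, members without a course).
    """
    periods_by_id = defaultdict(list)
    for member in memberships:
        periods_by_id[member[0]].append(member)

    matched, orphan_courses, used = [], [], set()
    for course in courses:
        person_id, course_date = course[0], course[1]
        # First membership period of this person that covers the course date
        hit = next(
            (m for m in periods_by_id.get(person_id, [])
             if m[1] <= course_date <= m[2]),
            None,
        )
        if hit is None:
            orphan_courses.append(course)
        else:
            matched.append((course, hit))
            used.add(hit)

    idle_members = [m for m in memberships if m not in used]
    return matched, orphan_courses, idle_members


# Hypothetical sample data
memberships = [
    (1, date(2013, 1, 1), date(2013, 6, 30)),
    (2, date(2013, 2, 1), date(2013, 3, 31)),
]
courses = [
    (1, date(2013, 3, 15), "stats"),   # inside member 1's period
    (1, date(2013, 9, 1), "python"),   # after member 1's period ended
    (3, date(2013, 2, 10), "spss"),    # no member with id 3
]

matched, orphan_courses, idle_members = match_courses(memberships, courses)
```

The two leftover lists are the cases that need a closer look: courses with no corresponding member period, and members with no courses.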

I'm not sure, though, that the standard MATCH FILES won't do what you need here. Mismatches will generate missing data in the variables, and you can select subsets based on that criterion.

djhurio
JKP
  • I think these hints are useful, and I will try them. I need some time to test the module. Thanks. – user2491338 Mar 13 '14 at 08:43
  • The last problem described [here](https://www.ibm.com/developerworks/community/forums/html/topic?id=77777777-0000-0000-0000-000014897265) is very similar to mine. – user2491338 Mar 14 '14 at 08:00