2

How do I read an RFC 4180-compliant CSV file into SPSS? Specifically, how do I handle string values with embedded double quotes that are (properly) escaped with a second double quote?

Here's one instance of a record with a problematic value:

2985909844,,3,3,3,3,3,3,1,2,2,"I recall an ad for ""RackSpace"", but I don't recall if this was here or in another page.",200,1,1,1,0,1,0,Often

The SPSS syntax I used is as follows:

GET DATA
  /TYPE=TXT
  /FILE="/Users/pieter/Work/Stackoverflow/2013_StackOverflowRecoded.csv"
  /IMPORTCASE=ALL
  /ARRANGEMENT=DELIMITED
  /DELCASE=LINE
  /FIRSTCASE=2
  /DELIMITERS=","
  /QUALIFIER='"'
  /VARIABLES=  ... list of column names...

The import succeeds, but gets off track and throws warnings after encountering such values.

prototype

2 Answers

2

I'm afraid this is a bug in SPSS and therefore not possible to solve.

You might want to ask the IBM Support team about this issue and post their answer here, if you find it helpful.

One workaround would be to change the escaped double quotes in your *.csv file(s) to some other quote type. This should take little work if you use an advanced text editor such as Notepad++ or the "sed" command-line tool on Unix-like operating systems.
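If neither tool is at hand, the same workaround can be sketched in Python. This is only a minimal sketch, not a tested recipe for SPSS: the function name `replace_embedded_quotes`, the choice of a single quote as the replacement, and the UTF-8 encoding are all assumptions for illustration. It relies on the standard `csv` module to parse the doubled quotes, so only quotes inside field values are changed while the enclosing qualifiers stay intact.

```python
import csv


def replace_embedded_quotes(src_path, dst_path, replacement="'", encoding="utf-8"):
    """Copy a CSV file, replacing double quotes inside field values.

    Hypothetical helper: the name, the default replacement character and
    the UTF-8 encoding are assumptions, not anything SPSS prescribes.
    """
    with open(src_path, newline="", encoding=encoding) as src, \
            open(dst_path, "w", newline="", encoding=encoding) as dst:
        reader = csv.reader(src)   # default dialect turns "" back into "
        writer = csv.writer(dst)   # quotes a field only when it needs it
        for row in reader:
            writer.writerow([field.replace('"', replacement) for field in row])
```

Calling `replace_embedded_quotes("in.csv", "out.csv")` (both file names are placeholders) writes a copy in which the sample value reads `"I recall an ad for 'RackSpace', but ..."`, so the file no longer contains any doubled qualifiers.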

mirirai
  • Similar behavior in PSPP, which is not what I wanted, yet it shows surprisingly awesome fidelity to the proprietary standard. – prototype Aug 05 '14 at 18:26
1

I tried an example in the current version of Statistics (22), and doubled quotes are handled correctly. However, if you generate the syntax with the Text Wizard, the field widths in the generated syntax are too short, so you would need to increase them.
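To decide how far to increase the widths, one option is to measure the longest value in each column before editing the generated syntax. The following is a rough sketch under stated assumptions: the helper name `max_field_widths` is hypothetical, the file is assumed to be UTF-8 with a header row (as `/FIRSTCASE=2` suggests), and lengths are counted in encoded bytes as a conservative choice, since a byte count is never smaller than a character count.

```python
import csv


def max_field_widths(csv_path, encoding="utf-8"):
    """Return (column name, longest value length) pairs for a CSV file.

    Hypothetical helper: assumes the first row holds the column names.
    Lengths are measured in encoded bytes, which is an assumption made
    to err on the side of wider fields.
    """
    with open(csv_path, newline="", encoding=encoding) as f:
        reader = csv.reader(f)      # un-escapes doubled quotes while parsing
        header = next(reader)
        widths = [0] * len(header)
        for row in reader:
            # Ignore any cells beyond the header's column count.
            for i, field in enumerate(row[:len(header)]):
                widths[i] = max(widths[i], len(field.encode(encoding)))
    return list(zip(header, widths))
```

The resulting numbers can then be compared with the string widths that the Text Wizard wrote into the `/VARIABLES` list.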

JKP