
The PISA datasets for 2000-2012 are distributed as fixed-width text files, with accompanying SPSS control files (syntax files?) that describe how to parse the data. I can't find a way for R to ingest this data: I've looked at haven and foreign, but haven't had any luck.

Example SPSS txt fixed width file:

https://www.oecd.org/pisa/pisaproducts/INT_Sch06_Dec07.zip

Example SPSS control file:

https://www.oecd.org/pisa/pisaproducts/PISA2006_SPSS_school.txt

Full datasets

pluke
  • The files look to be fixed width, readable by `read.fwf` in base R or `read_fwf` from *readr* - https://stackoverflow.com/questions/14383710/read-fixed-width-text-file. The SPSS control files list the start and end characters for each column, e.g. `1 - 5` and `6 - 10` in the control file translate to `widths = c(5, 5)` in `read.fwf` or `col_positions = fwf_widths(c(5, 5))` in `read_fwf`. Some manual handling may well be required to pull all those details out of the control file. – thelatemail Oct 18 '22 at 21:06
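A minimal base R sketch of the approach in the comment above: pull the `start - end` positions out of the control file, turn them into widths, and pass those to `read.fwf`. The control-file excerpt, variable names, and data rows here are made up for illustration; the real OECD control files may be laid out differently, so the regular expression is an assumption and will likely need adjusting.

```r
# Hypothetical excerpt in the style of an SPSS DATA LIST block.
# The real control file may differ in layout and variable names.
control_lines <- c(
  "DATA LIST FILE='school.txt' /",
  "  COUNTRY  1 - 3 (A)",
  "  SCHOOLID 4 - 8 (A)",
  "  SCHLTYPE 9 - 9",
  "  ."
)

# Keep only lines of the form: NAME start - end
pattern <- "^[[:space:]]*([A-Za-z_][A-Za-z0-9_]*)[[:space:]]+([0-9]+)[[:space:]]*-[[:space:]]*([0-9]+)"
spec_lines <- control_lines[grepl(pattern, control_lines)]
parts <- regmatches(spec_lines, regexec(pattern, spec_lines))

layout <- data.frame(
  name    = vapply(parts, `[`, character(1), 2),
  start   = as.integer(vapply(parts, `[`, character(1), 3)),
  end     = as.integer(vapply(parts, `[`, character(1), 4)),
  is_char = grepl("(A)", spec_lines, fixed = TRUE),  # (A) marks a string column
  stringsAsFactors = FALSE
)
layout$width <- layout$end - layout$start + 1L

# Made-up data rows matching the layout above. This assumes the columns are
# contiguous and start at position 1; gaps would need negative widths.
raw <- c("USA000011", "GBR000022")
dat <- read.fwf(
  textConnection(raw),
  widths     = layout$width,
  col.names  = layout$name,
  colClasses = ifelse(layout$is_char, "character", "numeric")
)
```

Reading the `(A)` columns as character keeps leading zeros in identifiers such as `SCHOOLID`. For the real files, `control_lines` would come from `readLines()` on the control file and the first argument of `read.fwf` would be the path to the unzipped data file.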

1 Answer


You can use the EdSurvey R package to analyze PISA data. It's designed for large-scale studies such as PISA and handles much of the grunt work of data prep and weighting. The `downloadPISA` function retrieves the data from the OECD, and the `readPISA` function parses the syntax scripts and prepares the data for analysis in EdSurvey.
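A rough sketch of how those two calls might fit together. The wrapper `load_pisa` is a hypothetical helper, and the argument names (`root`, `years`, `path`, `countries`) and the `PISA/<year>` folder layout are written as I recall them from the package documentation, so check `?downloadPISA` and `?readPISA` before relying on them.

```r
# Hypothetical helper wrapping the two EdSurvey functions named above.
# Argument names and the "PISA/<year>" sub-folder are assumptions; verify
# them against the installed package's help pages.
load_pisa <- function(root = "~", year = 2006, countries = "usa") {
  if (!requireNamespace("EdSurvey", quietly = TRUE)) {
    stop("Install EdSurvey first: install.packages('EdSurvey')")
  }
  # Fetch the data files and SPSS syntax files from the OECD
  EdSurvey::downloadPISA(root = root, years = year)
  # Parse the syntax files and return an EdSurvey data object
  EdSurvey::readPISA(
    path      = file.path(root, "PISA", year),
    countries = countries
  )
}
```

Calling `load_pisa()` triggers a sizeable download, so it is worth pointing `root` at a directory with enough free space.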

Fink
  • Thanks. Is the control file txt model a bespoke OECD thing? It's surprising that haven and foreign don't seem to handle it. – pluke Oct 19 '22 at 09:09
  • Most OECD and NCES data I've run into follow a similar pattern: the raw data is a fixed-width data file, and they provide various SAS, SPSS, and Stata scripts for reading the data into each of those packages. I'm assuming this is to avoid stale or incompatible data formats over time. Parsing these control files can be very difficult, and because the syntax is proprietary, not a lot of information is available. – Fink Oct 19 '22 at 15:24