I am trying to apply SPSS-style category labels to my dataset in R. I suspect my real problem is that I do not know how to pass variables into a function call correctly, so the question is not necessarily specific to this kind of data. To begin with, doing this manually as per the expss library documentation works fine:

library(expss)

#Load in the data
data(mtcars)

#Apply Variable Labels and Value Labels (and Numeric Coding) to each Variable.
mtcars = apply_labels(mtcars,
                      vs = "Engine",
                      vs = c("V-engine" = 1,
                             "Straight engine" = 2,
                             "Other engine" = 3)
)

Now my problem arises if I have my "Variable Names", "Variable Labels", "Value Labels" and corresponding "Value Numeric Codes" stored in some R data type and I try to use them in the apply_labels function. For example, if I have these stored in character vectors like so:

#Load in the data
data(mtcars)

#Value Labels
value_lab<-c("V-engine","Straight engine","Other engine")
#Value's Numeric coding
value_num<-c("1","2","3")

#Variable names
var <- c("vs")
#Variable Labels
var_lab<-c("Engine")

Then my question is, how would I use my character vector elements inside the apply_labels function? e.g. how would I do something like this:

#Apply Variable Labels and Value Labels (and Numeric Coding) to each Variable.
mtcars = apply_labels(mtcars,
                      var[1] = var_lab[1],
                      var[1] = c(value_lab[1] = value_num[1],
                                 value_lab[2] = value_num[2],
                                 value_lab[3] = value_num[3])
)

I have tried various combinations of paste and toString without success. My next step will be to apply this to my 500,000+ rows x 20,000 columns of data with a to-be-determined number of possible Value Labels/Numeric Codings. Obligatory: I am new to R. Thank you.

rhacker007

2 Answers

To achieve your desired result:

  1. Store your variable and value labels in named lists and vectors.
  2. Use do.call to pass those named lists to apply_labels as arguments.

To make the example more interesting I added labels for a second variable.

library(expss)

# Variable Labels
var_labels <- list(vs = "Engine", am = "Transmission")
#Value Labels
val_labels <- list(
  vs = c("V-engine" = 0, "Straight engine" = 1),
  am = c("Automatic" = 0, "Manual" = 1)
)

mtcars2 <- do.call(apply_labels, c(list(data = mtcars), var_labels, val_labels))

table(mtcars2$am, mtcars2$vs)
#>            
#>             V-engine Straight engine
#>   Automatic       12               7
#>   Manual           6               7
stefan

Great, thank you! That has led me to understand named lists and build a solution with setNames.
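
For anyone who wants to stay with expss, here is a minimal sketch of how setNames can turn plain character vectors like the ones in the question into the named lists that do.call expects. The names var_names, var_texts, value_texts and value_codes are illustrative (renamed from the question so they do not shadow expss functions such as var_lab), and the codes 0/1 are an assumption chosen to match the values actually present in mtcars$vs:

```r
library(expss)

data(mtcars)

# Illustrative inputs, all stored as character vectors as in the question.
# Codes "0"/"1" are assumed here because those are the values in mtcars$vs.
var_names   <- c("vs")
var_texts   <- c("Engine")
value_texts <- c("V-engine", "Straight engine")
value_codes <- c("0", "1")

# Named list of variable labels: list(vs = "Engine")
var_labels <- setNames(as.list(var_texts), var_names)

# Named list with one named numeric vector of value labels per variable:
# list(vs = c("V-engine" = 0, "Straight engine" = 1))
val_labels <- setNames(
  list(setNames(as.numeric(value_codes), value_texts)),
  var_names
)

# do.call() passes each list element to apply_labels() as a named argument
mtcars2 <- do.call(apply_labels, c(list(data = mtcars), var_labels, val_labels))

# Inspect the labels that were attached
expss::var_lab(mtcars2$vs)
expss::val_lab(mtcars2$vs)
```

This only covers a single variable; with more variables, var_labels and val_labels would simply gain one named element each.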

I ended up not using expss. It appeared to work within R and labelled everything as expected, but when I exported the final dataframe from R to SPSS using haven::write_sav, the value labels were not maintained (but the variable labels were).

Instead I used the haven labelled vector class to apply the Variable and Value labels. My final solution looks like this:

#Load in the data
data(mtcars)

#Variables
var <- c("vs")
#Variable Labels
var_labels<-c("Engine")

#Value Labels (for first Variable)
value_labs<-c("V-engine","Straight engine","Other engine")
#Value's Numeric coding
value_num<-c("1","2","3")

#Make a named vector (label text -> numeric code) to use as the value labels
value_labels <- setNames(as.integer(value_num),value_labs)

#Apply the label with haven (namespaced call, since haven is not attached above)
mtcars[,c(var[1])]<-haven::labelled(mtcars[, c(var[1])],
                                     labels=value_labels,
                                     label=var_labels[1])

#Save out in spss format
haven::write_sav(mtcars, "test.sav")

Also, I have set things up so that my data comes in one group of value labels at a time, but your example of extending to a second variable helped me generalise this too, so thanks again!
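
As a rough sketch of that generalisation (not my exact production code), the same haven call can be applied in a loop over several variables. The lookup structures var_labels and val_labels below are hypothetical, and the 0/1 codes are assumed to match the values in mtcars:

```r
library(haven)

data(mtcars)

# Hypothetical lookups: one entry per variable to be labelled
var_labels <- c(vs = "Engine", am = "Transmission")
val_labels <- list(
  vs = c("V-engine" = 0, "Straight engine" = 1),
  am = c("Automatic" = 0, "Manual" = 1)
)

# Wrap each listed column in a haven labelled vector
for (v in names(var_labels)) {
  mtcars[[v]] <- labelled(mtcars[[v]],
                          labels = val_labels[[v]],
                          label  = var_labels[[v]])
}

# Inspect what was attached to one of the columns
attr(mtcars$am, "label")
attr(mtcars$am, "labels")
```

The labelled data frame can then be passed to haven::write_sav as before.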

rhacker007