
I have a large number of data files that are either SPSS or text. When the SPSS files are imported into R with read.spss from the foreign package using use.value.labels = TRUE, value labels are added automatically. These are stored as a value.labels attribute on each column of the data frame. I need to keep the structure of the imported objects consistent regardless of their source (SPSS or text), so I need to assign a value.labels attribute and its values to each non-numeric column (either factor or character) in the data frames imported from the text files. Here is an excerpt from a data frame imported from a text file:

> mydf <- data.frame(w = factor(c(1, 2, 3)), x = c("fourth", "fifth", "sixth"),
y = c(9.3, 8.8, 2.6), z = factor(c(7, 8, 9)), stringsAsFactors = FALSE)

I can do the following column by column:

> attr(mydf$w, "value.labels") <- c(first = "1", second = "2", third = "3")
> attr(mydf$x, "value.labels") <- c(f4 = "fourth", f5 = "fifth", f6 = "sixth")
> attr(mydf$z, "value.labels") <- c(seventh = "7", eighth = "8", ninth = "9")

And then check:

> attributes(mydf$w)
$levels
[1] "1" "2" "3"

$class
[1] "factor"

$value.labels
first second  third 
   "1"    "2"    "3" 

However, with a large number of data frames, each containing numerous columns, this is not efficient. Is it possible to do it automatically, given a list of value labels such as:

> lst.attr <- list(w = c(first = "1", second = "2", third = "3"),
x = c(f4 = "fourth", f5 = "fifth", f6 = "sixth"), z = c(seventh = "7",
eighth = "8", ninth = "9"))
panman
  • You could try `Map("attr<-", mydf[-3], "value.labels", lst.attr)` but I'm not sure that would be any more efficient – Rich Scriven Jan 24 '15 at 21:28
  • @Richard: Thank you very much! Your solution works; the only issue is that for the character column (`mydf$x`) no `value.labels` attribute was added (well, I know it is not a factor). Can this be fixed? – panman Jan 24 '15 at 22:17
  • That code is basically the same as Sven's answer. You could substitute his `res <- ` call with my code from above, substituting the second argument as `mydf[idx]` – Rich Scriven Jan 25 '15 at 04:47
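
The `Map("attr<-", ...)` idea from the comments can be sketched as follows. This is only a minimal illustration, assuming the labels list is named after the columns it applies to; the helper name `add_value_labels` is hypothetical and not part of any package. Assigning the result back with `df[idx] <- ...` (rather than rebuilding the data frame) is what should keep the character column as character with its attribute intact.

```r
# Example data and label list, copied from the question
mydf <- data.frame(w = factor(c(1, 2, 3)), x = c("fourth", "fifth", "sixth"),
                   y = c(9.3, 8.8, 2.6), z = factor(c(7, 8, 9)),
                   stringsAsFactors = FALSE)

lst.attr <- list(w = c(first = "1", second = "2", third = "3"),
                 x = c(f4 = "fourth", f5 = "fifth", f6 = "sixth"),
                 z = c(seventh = "7", eighth = "8", ninth = "9"))

# Hypothetical helper: attach a "value.labels" attribute to every
# non-numeric column that has a matching (by name) entry in `labels`
add_value_labels <- function(df, labels) {
  non_numeric <- names(df)[!vapply(df, is.numeric, logical(1))]
  idx <- intersect(non_numeric, names(labels))
  # "attr<-"(x, which, value) is applied column by column;
  # the attribute name "value.labels" is recycled by Map
  df[idx] <- Map("attr<-", df[idx], "value.labels", labels[idx])
  df
}

mydf <- add_value_labels(mydf, lst.attr)
attributes(mydf$x)
```

For many data frames held in a list (say a hypothetical `dfs`), the same helper could be applied with something like `lapply(dfs, add_value_labels, labels = lst.attr)`, provided one label list fits all of them.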
