
I'm trying to merge several data frames, each with a very large number of variables named with codes such as c302, c303, etc. The original file imported from SPSS keeps very useful variable labels.

When I try to merge those data frames (using cbind or merge), I lose all the variable labels. Is it possible to keep them?

Lucas Sempe
  • I have flagged this question, as it is not about statistics but focuses on programming, specifically R. I think it would do better on Stack Overflow, but perhaps it would be a duplicate there. – Repmat May 08 '16 at 05:59
  • I don't work with SPSS, and haven't imported those files before. A general solution would be to figure out how the "labels" are stored using `str` and then to apply the labels to your new data.frame using `attr` or some similar tool that is relevant to the storage method. – lmo May 08 '16 at 12:03
  • If you want more help, you will need to provide more information as to how these labels are stored, and maybe provide an example data.frame or two to reproduce your problem. See `?dput` and [how to make a great reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example#answer-5963610). – lmo May 08 '16 at 12:06
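
The approach lmo describes can be sketched in base R. This is only an illustration under assumptions: the two data frames, their label text, and the helper `get_labels` are hypothetical, and it assumes each label is stored in a `label` attribute on the column, which you should confirm with `str` on your own import.

```r
# Two toy data frames standing in for SPSS imports (hypothetical data).
df1 <- data.frame(id = 1:3, c302 = c(10, 20, 30))
df2 <- data.frame(id = 1:3, c303 = c("a", "b", "c"))

# Assumption: each label lives in a "label" attribute on the column.
attr(df1$c302, "label") <- "Household income"
attr(df2$c303, "label") <- "Region of residence"

# Inspect how the labels are stored.
str(df1)

# Hypothetical helper: collect the labels of all labelled columns.
get_labels <- function(df) {
  labs <- lapply(df, attr, which = "label", exact = TRUE)
  labs[!vapply(labs, is.null, logical(1))]
}

# Save the labels from every source data frame before merging.
labels <- c(get_labels(df1), get_labels(df2))

merged <- merge(df1, df2, by = "id")

# Reapply each saved label to the column of the same name.
for (nm in intersect(names(labels), names(merged))) {
  attr(merged[[nm]], "label") <- labels[[nm]]
}
```

Saving the labels by column name before merging means the same loop works whether the data frames are combined with `merge` or `cbind`, as long as the column names are unchanged by the operation.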

1 Answer


Instead of using `merge`, use dplyr's `left_join`, which preserves the column attributes where the labels are stored:

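```r
library(dplyr)
library(haven)

# Read the SPSS files; haven keeps the variable labels as attributes
df1 <- read_sav("one.sav")
df2 <- read_sav("two.sav")

# "var_name" is the key column shared by both data frames
df <- left_join(x = df1, y = df2, by = "var_name")
```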
Tom Stewart
  • When you use by = c("var_table1" = "var_table2"), var_table1 is returned in the new data frame, but var_table2 isn't. How can I get var_table2 as well? – nerdlyfe Feb 06 '20 at 21:24
  • In that case, var_table1 would be equal to var_table2, so it drops one of them. – Tom Stewart Mar 09 '20 at 06:51
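
If both key columns are wanted in the result, one option in dplyr 1.0.0 and later is the `keep` argument of the join functions. The sketch below is an illustration rather than part of the original answer; `table1`, `table2`, `x_val` and `y_val` are made-up names.

```r
library(dplyr)

# Hypothetical tables whose key columns have different names.
table1 <- data.frame(var_table1 = c(1, 2, 3), x_val = c("a", "b", "c"))
table2 <- data.frame(var_table2 = c(1, 2), y_val = c(TRUE, FALSE))

# keep = TRUE retains the key columns from both inputs.
joined <- left_join(table1, table2,
                    by = c("var_table1" = "var_table2"),
                    keep = TRUE)

names(joined)
```

For rows of `table1` with no match in `table2`, `var_table2` is `NA`, which also makes unmatched rows easy to spot.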