
I have a vector like the following

label_names <- c("A", "B", "C")

I also have the following data.frame

df <- data.frame(var1 = 1:5, var2 = 1:5, var3 = 1:5)

What I want to do is assign label_names to the columns of df as variable labels, then write the data.frame as a .dta file so that I can read the labelled data in Stata. Note that my actual vector is longer than this, so I need a general solution.

Nick Cox
Enes I.
  • What do you mean with "assign"? Do you want to add it to the dataframe as a fourth variable? Or do you want to change the column names (var1, var2, var3) to the labels_names? Or do you want to make a factor out of one of your variables with the labels of label names (which does not make sense here as the labels do not match up with the unique possibilities)? – Annet Dec 24 '19 at 10:50
  • Maybe `attr(df, "label_names") <- label_names`? – Rui Barradas Dec 24 '19 at 10:51
  • `names(df) <- label_names` – Allan Cameron Dec 24 '19 at 11:01
  • When I export the data after applying the methods you mentioned, STATA still does not show variable labels. @Annet label names and column names follow the same order, so labels do not need to match up with unique possibilities. – Enes I. Dec 24 '19 at 11:08
  • You need to 'assign' the label_names to the attribute label in the data frame, for that you can use the Hmisc package. I don't have Stata right now to confirm it works though. `for(i in seq_along(df)){ Hmisc::label(df[, i]) <- label_names[i] }` – csmontt Dec 24 '19 at 11:32
  • @EnesI. I do not understand your answer to my questions, especially as you seem to imply that you want to change the column names while at the same time you do not (you want attributes, not column names). As changing names is a very basic, easy-to-google question, I do not think you asked for that. Still, one of the first answers is exactly that, which in my opinion might be because the question is not that clear. Also: https://stackoverflow.com/questions/2151147/using-stata-variable-labels-in-r – Annet Dec 24 '19 at 11:46
  • https://www.statalist.org/forums/help#spelling – Nick Cox Dec 24 '19 at 11:57

1 Answer


In my limited experience, R does not use labels as often as Stata. Still, you can do it. You need two packages: expss and haven.

In R, use expss::apply_labels to assign variable labels and value labels. For demonstration, I use only variable labels here:

df <- data.frame(year = c(2017, 2018, 2019), 
                 age = c(25, 30, 40), 
                 bmi = c(23.2, 28.3, 32))
# assign variable labels in R
df <- expss::apply_labels(df,
                          year = "Survey year",
                          age = "Age, years",
                          bmi = "BMI, kg/m2")
# export to Stata
haven::write_dta(df, "havenstata.dta")

In Stata, you can check:

. des

Contains data from C:\...\havenstata.dta
  obs:             3                          
 vars:             3                          24 Dec 2019 22:17
 size:            72                          
------------------------------------------------------------------------------------------------------
              storage   display    value
variable name   type    format     label      variable label
------------------------------------------------------------------------------------------------------
year            double  %10.0g                Survey year
age             double  %10.0g                Age, years
bmi             double  %10.0g                BMI, kg/m2
------------------------------------------------------------------------------------------------------
Sorted by: 

Although there are solutions in R, I think it might be easier to label the data in Stata after exporting it.
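
The example above hard-codes one label per variable, while the question has the labels in a vector. A minimal sketch of one way to handle that general case, assuming label_names has exactly one entry per column in column order (as stated in the comments), is to set each column's "label" attribute in a loop; haven writes that attribute out as the Stata variable label. The file name "labelled.dta" is arbitrary and chosen here for illustration, and the export step is guarded so the snippet still runs when haven is not installed:

```r
label_names <- c("A", "B", "C")
df <- data.frame(var1 = 1:5, var2 = 1:5, var3 = 1:5)

# Assumption: one label per column, in the same order as the columns.
stopifnot(length(label_names) == ncol(df))

# Set the "label" attribute on every column; this is the attribute that
# haven exports as the Stata variable label.
for (i in seq_along(df)) {
  attr(df[[i]], "label") <- label_names[i]
}

# Export to Stata ("labelled.dta" is an arbitrary, illustrative file name).
if (requireNamespace("haven", quietly = TRUE)) {
  haven::write_dta(df, "labelled.dta")
}
```

Because the loop runs over seq_along(df), it works for any number of columns. The column names (var1, var2, var3) stay unchanged; only the labels are attached.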

Zhiqiang Wang