How can I label a variable with the labels stored in a data frame in R?

I have a data frame like this:

Name Sex
Ind 1 1
Ind 2 2
Ind 3 1
Ind 4 2
Ind 5 2
Ind 6 1
Ind 7 1
... ...

And a second data frame with the labels for Sex:

Code Label
1 Male
2 Female

How can I label the Sex variable in the first data frame with the labels stored in the second data frame?

  • Check the 'labelled' package, specifically the 'set_value_labels' function. Alternatively, you can turn your variable into a factor variable. – deschen Jul 29 '21 at 05:22
  • `inner_join(df, labels, by = c(Sex = "Code")) %>% select(Name, Sex = Label)` – andrew_reece Jul 29 '21 at 05:35
  • Do you just want to merge the data? Or replace it? Base R doesn't really do "labels" for values. Are you coming from a different language or something? Or what exactly is the requirement? What do you need to do with the data after you've made the transformation? – MrFlick Jul 29 '21 at 05:36

1 Answer

There are several ways to do this.

Note: For future use and finding friends here, please always provide a reproducible example.

example data

df  <- data.frame(Name = c("Ind1", "Ind2", "Ind3", "Ind4", "Ind5"), Sex = c(1, 2, 1, 2, 2))
df2 <- data.frame(Code = c(1, 2), Label = c("Male", "Female"))

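label by position

You can use the values of Sex as positions into the vector of labels. This works here because the codes 1 and 2 coincide with the row positions of the labels in df2; with other codes, or with the rows of df2 in a different order, the lookup would fail or return the wrong label.

The following stores the result in a new column, Sex2. You can also overwrite the Sex column with the result.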

df$Sex2 <- df2$Label[df$Sex]
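
If the codes are not simply 1, 2, ... in row order, a lookup by value may be safer than a lookup by position. The following is a minimal sketch of two base R options, match() and factor(); the column names Sex_label and Sex_factor are made up for illustration, and stringsAsFactors = FALSE is set only so the sketch behaves the same on R versions before 4.0.

```r
# Example data from above, rebuilt so the block runs on its own
df  <- data.frame(Name = c("Ind1", "Ind2", "Ind3", "Ind4", "Ind5"),
                  Sex  = c(1, 2, 1, 2, 2),
                  stringsAsFactors = FALSE)
df2 <- data.frame(Code = c(1, 2), Label = c("Male", "Female"),
                  stringsAsFactors = FALSE)

# match() returns, for each Sex value, the row of df2 whose Code equals it,
# so the lookup does not depend on the row order of df2
df$Sex_label <- df2$Label[match(df$Sex, df2$Code)]   # hypothetical column name

# A factor uses the codes as levels and attaches the labels to them
df$Sex_factor <- factor(df$Sex, levels = df2$Code, labels = df2$Label)   # hypothetical column name
```

Codes in df that have no entry in df2 come out as NA in both columns, which makes missing labels easy to spot.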

joining

The other, more general way to do this is to join your data frames.

library(dplyr)   # for the pipe and left_join()

df <- df %>%
  left_join(df2, by = c("Sex" = "Code"))   # match df$Sex against df2$Code

This adds a Label column, which you can then process further, for example by renaming it or by dropping the numeric Sex column.
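
One possible way to do that further processing, following the select() call suggested in the comments, is sketched below. The name labelled_df is hypothetical, and the sketch assumes you want the labels to replace the numeric codes entirely.

```r
library(dplyr)

# Example data from above, rebuilt so the block runs on its own
df  <- data.frame(Name = c("Ind1", "Ind2", "Ind3", "Ind4", "Ind5"),
                  Sex  = c(1, 2, 1, 2, 2),
                  stringsAsFactors = FALSE)
df2 <- data.frame(Code = c(1, 2), Label = c("Male", "Female"),
                  stringsAsFactors = FALSE)

labelled_df <- df %>%                        # hypothetical result name
  left_join(df2, by = c("Sex" = "Code")) %>% # adds the Label column
  select(Name, Sex = Label)                  # drop the numeric code, rename Label to Sex
```

Because left_join() keeps every row of df, individuals whose code has no entry in df2 stay in the result with NA as their label.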

Example output:

df
  Name Sex   Sex2  Label
1 Ind1   1   Male   Male
2 Ind2   2 Female Female
3 Ind3   1   Male   Male
4 Ind4   2 Female Female
5 Ind5   2 Female Female
Ray