I have a data frame containing one variable "label" and want to add another variable "gender" based on information from another data frame that also contains the "label" variable. I usually use the match function and it normally works. However, this time it adds the variable, but with NAs as values. I guess this is a basic problem but I can't figure out a solution.
df1
label
1 HDJ3
2 K4JS
3 SO25
4 L9HW
df2
label gender
1 SO25 m
2 HDJ3 f
3 L9HW f
4 K4JS m
df1$gender <- df2$gender[match(df1$label, df2$label)]
What I want is
df1
label gender
1 HDJ3 f
2 K4JS m
3 SO25 m
4 L9HW f
What I get is
df1
label gender
1 HDJ3 NA
2 K4JS NA
3 SO25 NA
4 L9HW NA
EDIT: The variables are all factors. I've already tried converting them to characters, but that doesn't work either. I've also tried the merge function, but the resulting data frame was completely empty, containing only the variable names. I'd be happy if somebody could help me with this. Thanks, and apologies in advance if this has been asked before.
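When match() returns only NAs and merge() returns zero rows, the keys in the two data frames probably share no identical values. A minimal sketch of how that could be checked, using made-up toy data (the trailing spaces in the toy df2 are an assumption chosen for illustration, and keys1, keys2 and shared are hypothetical helper names):

```r
# Toy data: df2 labels carry a trailing space (assumed for illustration)
df1 <- data.frame(label = factor(c("HDJ3", "K4JS", "SO25", "L9HW")))
df2 <- data.frame(label = c("SO25 ", "HDJ3 ", "L9HW ", "K4JS "),
                  gender = c("m", "f", "f", "m"),
                  stringsAsFactors = FALSE)

# Compare the keys as plain character vectors
keys1 <- as.character(df1$label)
keys2 <- as.character(df2$label)

# Labels present in both data frames; length 0 means nothing can be matched
shared <- intersect(keys1, keys2)
length(shared)

# Quoted printing and character counts expose invisible differences
print(keys2, quote = TRUE)
nchar(keys1)
nchar(keys2)
```

If the character counts differ for labels that look the same when printed, invisible characters such as whitespace are a likely cause.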
Edit2: The structure of the data frames shows that the variables differ:
> dput(df1)
structure(list(label = structure(c(31L, 25L, 7L, 12L, 15L, 32L,
33L, 24L, 14L, 17L, 1L, 28L, 20L, 6L, 11L, 19L, 9L, 16L, 22L,
37L, 26L, 39L, 34L, 29L, 13L, 5L, 36L, 4L, 18L, 2L, 23L, 30L,
3L, 8L, 35L, 27L, 10L, 38L, 21L), .Label = c("09YG", "0FWR",
"0PZS", "4L78", "56C9", "5B1K", "5CL9", "5RJG", "696K", "8ZOQ",
"92MB", "95KI", "99H5", "9VOZ", "A8KP", "A9ME", "APA5", "BVDN",
"DI7S", "E4MS", "EPTR", "H34H", "HRTI", "JLSK", "K472", "KWWO",
"MHAF", "PSK5", "Q6A4", "S2CK", "S7RU", "SK7H", "SRS8", "TCFS",
"VQFM", "VWV4", "Z1GE", "ZGBU", "ZQZ7"), class = "factor")), row.names = c(NA,
-39L), class = "data.frame")
> dput(df2)
structure(list(label = c("S7RU ", "K472 ", "5CL9 ",
"95KI ", "A8KP ", "-99 ", "SK7H ", "SRS8 ", "JLSK ",
"95KI ", "-99 ", "9VOZ ", "APA5 ", "09YG ", "PSK5 ",
"E4MS ", "5B1K ", "92MB ", "DI7S ", "JLSK ", "696K "
), gender = c(3, 2, 3, 3, 3, 2, 3, 3, 3, 3, 3, 2, 3, 2, 3, 2,
3, 2, 3, 3, 3)), row.names = c(NA, -21L), class = "data.frame")
The problem I see is the trailing blank spaces in the label values of df2. Can anyone tell me where they come from and how I can fix this?
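One possible fix, sketched on a small subset of the values from the dput() output above: strip the whitespace with base R's trimws() before matching. This is a sketch under the assumption that the trailing spaces are the only difference between the keys.

```r
# Small subset of the values shown in the dput() output
df1 <- data.frame(label = factor(c("S7RU", "K472", "5CL9", "Z1GE")))
df2 <- data.frame(label = c("S7RU ", "K472 ", "5CL9 "),
                  gender = c(3, 2, 3),
                  stringsAsFactors = FALSE)

# Remove leading and trailing whitespace from the lookup keys
df2$label <- trimws(df2$label)

# Match on character keys; labels that are absent from df2 stay NA
df1$gender <- df2$gender[match(as.character(df1$label), df2$label)]
df1
```

Note that match() uses the first hit, so for labels that occur more than once in df2 (such as "95KI" or "JLSK" in the full data) only the first gender value would be taken. If the data are imported with read.csv(), setting strip.white = TRUE may prevent the spaces from appearing in the first place.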