I am working with a large downloaded dataset, with the end goal of joining many data frames.
For the past week or so, I have been unable to join data frames due to incompatibility of the data types "labelled" vs. "character." Ultimately I would like to map my function to the same variable in multiple data frames in a list.
Structure of each df is as follows (edited to change variables/attr names because I cannot share the data). The variable of interest I'm working with here is "CODE":
structure(list(VAR1 = structure(c(val, val, val, val, val, val), .Label = c("a",
"b", "c", "d"), class = "factor"), ID = c(1,
2, 3, 4, 5, 6), CODE = structure(c("c1", "c1",
"c1", "c1", "c1", "c1"), label = "instance code", units = "-4", class = c("labelled",
"character")), ...
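Since the real data cannot be shared, here is a minimal reproducible stand-in, assuming only base R. All values and the names `my.df` and `my.list` are made up for illustration; the only thing it tries to mirror from the structure above is a `CODE` column carrying `label` and `units` attributes and the class `c("labelled", "character")`.

```r
# Toy data frame; every value here is invented, not taken from the real data.
my.df <- data.frame(
  VAR1 = factor(c("a", "b", "c", "d", "a", "b"), levels = c("a", "b", "c", "d")),
  ID = c(1, 2, 3, 4, 5, 6)
)

# Attach the same attributes and class that the real CODE column shows.
my.df$CODE <- structure(
  rep("c1", 6),
  label = "instance code",
  units = "-4",
  class = c("labelled", "character")
)

# Wrap it in a named list, as in the question.
my.list <- list(my.df = my.df)

# Inspect the class and attributes of the column of interest.
class(my.list[["my.df"]][["CODE"]])
attributes(my.list[["my.df"]][["CODE"]])
```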
I'm still relatively new to R/RStudio, so I thought for a while my issue was with mapping throughout the list, but when I pick one element to remove labels, it still doesn't work. It is almost as if R doesn't know that the label is there, despite the fact that when I use get_label, the label shows up (function below).
get_label(my.list[["my.df"]][["my.variable"]])
I have tried the following methods (I am showing it as if I am working with a single variable instead of the whole list, which is how I have been experimenting for the last couple of days):
- The class function. Interestingly, when I call this back, it says the class is character; however, when I look at the dataframe, the class still says "character [# of elements] (S3: labelled, character)"
class(my.list[["my.df"]][["my.variable"]]) <- "character"
- remove_label function
remove_label(my.list[["my.df"]][["my.variable"]])
- unclass function. This one worked for one variable at a time, but did not map over the whole list, so I am including my mapping code in case that is the issue in this case.
## for one variable
unclass(my.list[["my.df"]][["my.variable"]])
## for entire list
my.list %>%
map_at("my.variable", ~ unclass)
## I also tried map in case it was a map_at issue--still didn't work.
- zap_label
zap_label(my.list[["my.df"]][["my.variable"]])
- setting the attribute to null
attr(my.list[["my.df"]][["my.variable"]], "label") <- NULL
- as.character
as.character(my.list[["my.df"]][["my.variable"]])
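For reference, a minimal base-R sketch of the end result being sought across the whole list. It uses a hypothetical toy list, and the helper names `make_df` and `strip_labelled` are illustrative, not from any package. One assumption it makes explicit: calls such as `unclass()` and `as.character()` return a modified copy, so the result has to be assigned back for the change to persist.

```r
# Hypothetical toy list; data frame and column names are illustrative only.
make_df <- function() {
  df <- data.frame(ID = c(1, 2, 3))
  df$CODE <- structure(c("c1", "c1", "c1"),
                       label = "instance code", units = "-4",
                       class = c("labelled", "character"))
  df
}
my.list <- list(df1 = make_df(), df2 = make_df())

# Drop the class and the label/units attributes from one column,
# returning the modified data frame so the change is kept.
strip_labelled <- function(df, col = "CODE") {
  if (col %in% names(df)) {
    x <- unclass(df[[col]])
    attr(x, "label") <- NULL
    attr(x, "units") <- NULL
    df[[col]] <- x
  }
  df
}

# Assign the result back; lapply() does not modify my.list in place.
my.list <- lapply(my.list, strip_labelled)
class(my.list[["df1"]][["CODE"]])
```

The same helper could presumably be passed to `purrr::map()` in place of `lapply()`; the important part in this sketch is that the function is applied to each data frame and the output is stored.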
Does anyone have any ideas? Could it be a bug in R, or is it just my relative inexperience with R showing?
I have also tried modifying these functions in case I was misinterpreting the label and it was value labels instead of variable labels causing the issue. It's not!
Thanks for any assistance!