After I create dummy variables with dummy_cols() from fastDummies, the resulting column names are unhelpful: they all start with ".data_".
a <- as.factor(c("green", "yellow", "blue"))
b <- as.factor(c("blue", "yellow", "green"))
df <- data.frame(a, b)
library(fastDummies)
dummy1 <- dummy_cols(df$a, remove_selected_columns = TRUE)
dummy2 <- dummy_cols(df$b, remove_selected_columns = TRUE)
I need to put the dummies back together in one data frame, so how do I replace the ".data_" part of each column name with the name of the variable it belongs to (e.g. a_blue, a_green, a_yellow for dummy1 and b_blue, b_green, b_yellow for dummy2)?
I found rename(), but I would have to call it for every column individually. Is there a more automated way?
EDIT: The output of dummy_cols() is a data frame with one new binary column per category of the original variable. So a, with the 3 categories yellow, blue and green, becomes a data frame with 3 columns called .data_blue, .data_green and .data_yellow. Maybe this helps to illustrate what I mean.
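To make the goal concrete, here is a minimal base R sketch of the kind of automated renaming I am after. It is only an illustration, not necessarily the best way: dummy1 and dummy2 are rebuilt by hand with the ".data_" names described above so that the snippet runs without fastDummies, and rename_dummies() is a hypothetical helper name, not a function from any package.

```r
# Hand-built stand-ins for the output of
# dummy_cols(df$a, remove_selected_columns = TRUE) and the same call on df$b.
dummy1 <- data.frame(.data_blue   = c(0L, 0L, 1L),
                     .data_green  = c(1L, 0L, 0L),
                     .data_yellow = c(0L, 1L, 0L))
dummy2 <- data.frame(.data_blue   = c(1L, 0L, 0L),
                     .data_green  = c(0L, 0L, 1L),
                     .data_yellow = c(0L, 1L, 0L))

# Hypothetical helper: replace a leading ".data_" with "<prefix>_"
# in every column name of a dummy data frame.
rename_dummies <- function(dummies, prefix) {
  names(dummies) <- sub("^\\.data_", paste0(prefix, "_"), names(dummies))
  dummies
}

# Name each dummy data frame after its source variable, rename all of them
# in one pass, then bind them column-wise. unname() keeps cbind() from
# prepending the list names to the column names a second time.
dummies <- list(a = dummy1, b = dummy2)
renamed <- Map(rename_dummies, dummies, names(dummies))
result  <- do.call(cbind, unname(renamed))

names(result)
```

The regular expression is anchored with ^ so only a leading ".data_" is replaced. As far as I understand the fastDummies documentation, calling dummy_cols(df, remove_selected_columns = TRUE) on the whole data frame instead of on single vectors might already name the columns after their source variable, which would make the renaming step unnecessary, but I have not confirmed that this holds for every version.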