I found a question that is roughly the reverse of mine here: R: Replace multiple values in multiple columns of dataframes with NA
But I couldn't adapt it to my data. In my case, I want to find the NAs and replace each of them with the value from another column.
I have a dataset dta1 with 2493 variables that I want to manipulate. Besides these 2493 variables there is a column var_fill. Whenever a value in any of the columns named in vars is NA, I want to fill it in with the value of var_fill from the same row. I tried reverse engineering the solution posted above, but it gives me multiple warnings like these:
1: In `[<-.factor`(`*tmp*`, list, value = structure(c(16946L, ... : invalid factor level, NA generated
2: In x[...] <- m : number of items to replace is not a multiple of replacement length
It also doesn't produce the result I want. This is the code I tried:
vars <- sprintf("var%04d", 1:2493)
dta1[vars] <- lapply(dta1[vars], function(x) replace(x,is.na(x), dta1$var_fill) )
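Reading the two warnings together suggests two separate problems. The call replace(x, is.na(x), dta1$var_fill) passes the whole var_fill column as the replacement, so its length does not match the number of NAs in x (the second warning). In addition, the column being assigned into is a factor, and a factor rejects values that are not already among its levels (the first warning). Below is a minimal sketch of one possible fix. It uses a small made-up data frame in place of dta1 and assumes that converting the columns to character is acceptable; if var_fill actually holds dates or numbers, the conversion step would need to change.

```r
# Toy stand-in for dta1 (hypothetical values; the real data has 2493 var columns)
dta1 <- data.frame(
  var0001  = c("a", NA, "c"),
  var0002  = c(NA, "y", NA),
  var_fill = c("f1", "f2", "f3"),
  stringsAsFactors = TRUE   # mimic factor columns, as the first warning suggests
)
vars <- sprintf("var%04d", 1:2)

# Convert to character so new values are not rejected as invalid factor levels
fill <- as.character(dta1$var_fill)

dta1[vars] <- lapply(dta1[vars], function(x) {
  x <- as.character(x)
  miss <- is.na(x)
  x[miss] <- fill[miss]   # use the fill value from the same row, not the whole column
  x
})
```

The key change is subsetting the fill vector with the same logical index used on x, so the replacement always has exactly as many elements as there are NAs.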
I apologize: because of the size of this data, I couldn't generate a full reproducible dataset, so I heavily subsetted it. The real data has about 3000 columns and 240K rows.
Here's the data: https://drive.google.com/file/d/1oj_nhd99ftgN1Bh930_IRQftLACR2FO9/view?usp=sharing
Even this subset is too big to post here, although it covers only 10 people.