In R, using variable names, rather than column values, to turn columns binary

Question

I have a set of categorical columns in my dataset that I'll be turning into binary variables (1/0).

There are many of these. Currently, I've printed the column names and their positions, copied them into a Word document, and then used the column numbers directly in the code:

binarydata<- rawdata3
my_cols = c(8:38, 48:52, 59:69, 96:118, 120:132, 145:148, 154:170, 223:330) 
binarydata[my_cols] <- as.integer(!is.na(binarydata[my_cols]))

Is there a way to do it using the variable names instead of the column numbers?

Any help appreciated,

It's hard to give a useful answer without a [representative sample](https://stackoverflow.com/q/5963269/5325862) of data. I imagine there are already posts on SO that should answer your question, but it's hard to point you to them without actually knowing what's in your data — camille, Nov 23 '20 at 00:24
Not exactly sure if I understand your question. You already have `my_cols` as numbers now you want their equivalent column names and use it in the code? Why? — Ronak Shah, Nov 23 '20 at 03:33
in case the column numbers change. I don't want to go through the process of finding the column numbers again — Pre, Nov 23 '20 at 22:40

akrun · Accepted Answer · 2020-11-23T00:36:16.713

1

We can use `colnames` to subset by name. `colnames` is more general than `names` because it also works on a matrix.

nm1 <- colnames(binarydata)[my_cols]
binarydata[nm1] <- lapply(binarydata[nm1], function(x) +(!is.na(x)))
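
As a minimal sketch of why storing names helps when positions shift, the toy data frame below stands in for `rawdata3`. Its column names (`id`, `drugA`, `drugB`, `age`) and the positions `2:3` are made up for illustration and are not from the original data.

```r
# Toy stand-in for rawdata3; all column names here are hypothetical.
rawdata3 <- data.frame(
  id    = 1:4,
  drugA = c("yes", NA, "no", NA),
  drugB = c(NA, 2.5, NA, 1.0),
  age   = c(34, 51, NA, 28),
  stringsAsFactors = FALSE
)

# One-off step: translate the known positions into names and keep the names.
my_cols <- 2:3
nm1 <- colnames(rawdata3)[my_cols]

# Later the columns arrive in a different order, so positions 2:3 would
# now point at the wrong columns.
binarydata <- rawdata3[c("age", "drugB", "id", "drugA")]

# The stored names still select the intended columns.
binarydata[nm1] <- lapply(binarydata[nm1], function(x) +(!is.na(x)))
print(binarydata)
```

Only `drugA` and `drugB` are recoded here; `age` keeps its original values, including its `NA`.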

Also, with dplyr, we can specify ranges of column names with `:`

library(dplyr)
mtcars1 <- mtcars %>% 
      mutate(across(c(mpg:disp, wt:qsec), ~ +(!is.na(.))))
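
For readers who prefer to stay in base R, the same idea of name ranges can be approximated without dplyr. The sketch below is one possible way to do it, not part of the answer above: `name_range()` is a hypothetical helper written for this example, and one `NA` is injected into `mtcars` only so the recoding has something to show.

```r
# Hypothetical helper: names of all columns from `from` to `to`, inclusive,
# in the order they currently appear in the data frame.
name_range <- function(df, from, to) {
  pos <- match(c(from, to), names(df))
  if (anyNA(pos)) {
    stop("column not found: ", paste(c(from, to)[is.na(pos)], collapse = ", "))
  }
  names(df)[pos[1]:pos[2]]
}

mtcars1 <- mtcars
mtcars1$mpg[1] <- NA   # inject one missing value, for illustration only

# Base R counterpart of c(mpg:disp, wt:qsec)
cols <- c(name_range(mtcars1, "mpg", "disp"),
          name_range(mtcars1, "wt", "qsec"))

mtcars1[cols] <- lapply(mtcars1[cols], function(x) +(!is.na(x)))
head(mtcars1[cols], 3)
```

Because the range is resolved from names at run time, inserting or removing unrelated columns elsewhere in the data does not require editing the code.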

edited Nov 23 '20 at 00:36

answered Nov 23 '20 at 00:19

akrun


What does this do? Does it change to 1 where a value is present, and 0 where it is not? – Pre Nov 23 '20 at 22:42
@Pre Yes, the `!` converts TRUE -> FALSE and vice versa. The `+` coerces TRUE -> 1 and FALSE -> 0 – akrun Nov 23 '20 at 22:43
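
The steps described in the comment above can be traced one at a time on a small vector. This is only an illustrative sketch; the vector `x` is made up, and it includes an empty string to show that `""` is not treated as missing.

```r
# Illustrative vector: a value, a missing value, and an empty string.
x <- c("a", NA, "")

missing_flag <- is.na(x)       # TRUE only where the value is NA
present_flag <- !missing_flag  # negation: TRUE where a value is present
binary       <- +present_flag  # unary plus coerces logical to integer 1/0

print(missing_flag)
print(present_flag)
print(binary)

# as.integer() performs the same coercion as the unary plus.
stopifnot(identical(binary, as.integer(present_flag)))
```

Note that the empty string counts as present. If blank strings should be treated as missing in your data, they would need to be converted to `NA` first.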

score 0 · Answer 2 · answered Nov 23 '20 at 00:25

This can also work, but it was not tested as no data was shared:

#Code
binarydata<- rawdata3
my_cols = c(8:38, 48:52, 59:69, 96:118, 120:132, 145:148, 154:170, 223:330) 
mynames <- names(binarydata)[my_cols]
binarydata[,mynames] <- as.integer(!is.na(binarydata[,mynames]))
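
Since the code above was posted untested, here is a small check on made-up data. It is a sketch under the assumption that more than one column is selected; the column names `id`, `q1`, and `q2` are hypothetical. It compares the matrix-style assignment with the column-by-column `lapply` form from the accepted answer.

```r
# Toy data; column names are hypothetical.
binarydata <- data.frame(
  id = 1:3,
  q1 = c("x", NA, "y"),
  q2 = c(NA, NA, 3),
  stringsAsFactors = FALSE
)
mynames <- c("q1", "q2")

# Matrix route: is.na() on a data frame returns a logical matrix, and
# as.integer() flattens it column by column before it is assigned back.
via_matrix <- binarydata
via_matrix[, mynames] <- as.integer(!is.na(via_matrix[, mynames]))

# Column-by-column route from the accepted answer.
via_lapply <- binarydata
via_lapply[mynames] <- lapply(via_lapply[mynames], function(x) +(!is.na(x)))

# Both routes should give the same 1/0 columns on this toy data.
stopifnot(identical(via_matrix$q1, via_lapply$q1),
          identical(via_matrix$q2, via_lapply$q2))
print(via_matrix)
```

The matrix route depends on the flattened values being refilled in column order, while the `lapply` form handles each column on its own, which some readers may find easier to reason about.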
