Assign a value of 1 to all positive counts across several columns in R

Question

I have a data table which looks like the following:

library(data.table)

Well <- c("A1", "A2", "A3", "A4", "A5")
Episyrphus <- c(0,0,5,3,1)
Eupeodes <- c(2,0,4,1,0)
Syrphus <- c(0,4,0,3,2)

dt <- data.table(Well, Episyrphus, Eupeodes, Syrphus)
dt

   Well Episyrphus Eupeodes Syrphus
1:   A1          0        2       0
2:   A2          0        0       4
3:   A3          5        4       0
4:   A4          3        1       3
5:   A5          1        0       2

What I want to do is change all of the non-zero values in the species columns (Episyrphus, Eupeodes, Syrphus) to 1, so that the table becomes:

   Well Episyrphus Eupeodes Syrphus
1:   A1          0        1       0
2:   A2          0        0       1
3:   A3          1        1       0
4:   A4          1        1       1
5:   A5          1        0       1

However, my problem is that I haven't been able to replace non-zeros with 1 across several columns. The solution I've reached for a single column is:

dt[Episyrphus > 0, Episyrphus := 1L]

But is there any way to use this same code structure, with perhaps a for loop(?), to perform the same function across all species columns?

I feel like the answer is probably obvious, but it's not to me! Any help would be much appreciated.

Sorry for the duplicate question, I hadn't managed to find the earlier question cited. — D.Hodgkiss, Apr 20 '18 at 05:31

score 1 · Answer 1 · answered Apr 18 '18 at 14:34

This is a dplyr approach:

library(dplyr)

# funs() is deprecated in current dplyr; pass the function directly
dt %>% mutate_if(is.numeric, sign)
#   Well Episyrphus Eupeodes Syrphus
# 1   A1          0        1       0
# 2   A2          0        0       1
# 3   A3          1        1       0
# 4   A4          1        1       1
# 5   A5          1        0       1
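
In more recent dplyr releases the same idea is usually written with `across()`. The following is a minimal sketch rather than part of the original answer; it assumes dplyr 1.0.0 or later and rebuilds the question's data as a plain data frame named `df` (a name chosen here for illustration). It tests `.x > 0` instead of calling `sign()`, so any negative values would become 0 rather than -1.

```r
library(dplyr)

# Same data as in the question, as a plain data frame (assumed name: df)
df <- data.frame(
  Well       = c("A1", "A2", "A3", "A4", "A5"),
  Episyrphus = c(0, 0, 5, 3, 1),
  Eupeodes   = c(2, 0, 4, 1, 0),
  Syrphus    = c(0, 4, 0, 3, 2)
)

# Convert every numeric column to a 0/1 presence indicator
result <- df %>%
  mutate(across(where(is.numeric), ~ as.integer(.x > 0)))

print(result)
```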

Martin Schmelzer · Accepted Answer · 2018-04-18T13:52:59.803

You can do

cols <- seq(ncol(dt))[-1]
dt[ , (cols) := lapply(.SD, as.logical), .SDcols = cols]
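
Note that `as.logical` yields TRUE/FALSE rather than the 0/1 shown in the question. If integers are required, the same `.SDcols` pattern can be used with a different function. The sketch below is an illustration under stated assumptions, not the answerer's code: it assumes the `dt` from the question, selects the species columns by name, and also shows a for loop with `set()`, which is the loop structure the question asks about. `dt1` and `dt2` are names introduced here so both variants start from the same data.

```r
library(data.table)

dt <- data.table(
  Well       = c("A1", "A2", "A3", "A4", "A5"),
  Episyrphus = c(0, 0, 5, 3, 1),
  Eupeodes   = c(2, 0, 4, 1, 0),
  Syrphus    = c(0, 4, 0, 3, 2)
)

# Species columns: everything except the first column (Well)
cols <- names(dt)[-1]

# Variant 1: test each column and convert the result to integer,
# updating the selected columns by reference
dt1 <- copy(dt)
dt1[, (cols) := lapply(.SD, function(x) as.integer(x > 0)), .SDcols = cols]

# Variant 2: explicit for loop; set() overwrites only the positive rows
dt2 <- copy(dt)
for (j in cols) {
  set(dt2, i = which(dt2[[j]] > 0), j = j, value = 1)
}

print(dt1)
print(dt2)
```

Variant 1 changes the column type to integer, while Variant 2 keeps the original numeric type and only touches rows with positive counts.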

edited Apr 18 '18 at 13:52

answered Apr 18 '18 at 13:44

Martin Schmelzer


The appropriate data type is logical. If it must be integers, they can be created easily from logical values. `ifelse` is inefficient and should always be avoided. `dt[, 2:4 := lapply(.SD, as.logical), .SDcols = 2:4]` – Roland Apr 18 '18 at 13:48
Point taken :) The key for this question I guess is how to apply a transformation on a subset of columns. I'll incorporate your advice. – Martin Schmelzer Apr 18 '18 at 13:50
Probably not better in terms of efficiency, but there's also `replace(x, x !=0, 1L)` -- more readable for me anyways. Btw, you use `cols` but do not define it first. Re "apply a transformation on a subset of columns" that q is over here: https://stackoverflow.com/q/16846380/ – Frank Apr 18 '18 at 13:51
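
The `replace()` idea from the last comment can be combined with the `.SDcols` pattern. This is a rough sketch, assuming the `dt` from the question and defining `cols` explicitly before use; it is one possible reading of the comment, not code posted by the commenter.

```r
library(data.table)

dt <- data.table(
  Well       = c("A1", "A2", "A3", "A4", "A5"),
  Episyrphus = c(0, 0, 5, 3, 1),
  Eupeodes   = c(2, 0, 4, 1, 0),
  Syrphus    = c(0, 4, 0, 3, 2)
)

# Define the target columns first (all columns except Well)
cols <- names(dt)[-1]

# replace() returns a copy of x with the non-zero entries set to 1;
# := then writes the result back into dt by reference
dt[, (cols) := lapply(.SD, function(x) replace(x, x != 0, 1L)), .SDcols = cols]

print(dt)
```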
