1

I have a data frame with many columns (more than 30), and the names of those columns are saved in a vector. I would like to apply the same criterion to all of those columns without writing out the condition for each one. The example below illustrates the problem.

A<-c("A","B","C","D","E","F","G","H","I")
B<-c(0,0,0,1,2,3,0,0,0)
C<-c(0,1,0,0,1,2,0,0,0)
D<-c(0,0,0,0,1,1,0,1,0)
E<-c(0,0,0,0,0,0,0,1,0)
data<-data.frame(A,B,C,D,E)

Say I have the above data frame as an example, and I have saved the column names as below:

list <- c("B","C","D","E")

I would like to apply the same condition to each of those columns, as below:

library(data.table)
setDT(data)[B>=1 | C>=1 | D>=1 | E>=1]

And get the following result

   A B C D E
1: B 0 1 0 0
2: D 1 0 0 0
3: E 2 1 1 0
4: F 3 2 1 0
5: H 0 0 1 1

However, is there a way to get the above result without writing out each column's condition (e.g. B>=1 | C>=1 ....)? I have more than 30 columns in the actual data. Thanks a lot.

zx8754
Molia
  • [see here](https://stackoverflow.com/questions/3505701/grouping-functions-tapply-by-aggregate-and-the-apply-family?rq=1) for useful information – Matthew Ciaramitaro Feb 27 '18 at 16:44
  • Possible duplicate of [Data table multiple condition with string vector](https://stackoverflow.com/questions/47652182/data-table-multiple-condition-with-string-vector) – denis Feb 27 '18 at 17:02

4 Answers

7

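For your specific example of checking whether at least one value in a row is at least 1, you could use rowSums. This works here because the values are non-negative integers, so a positive row sum means at least one value is at least 1: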

data[rowSums(data[,-1]) > 0, ]
#   A B C D E
# 2 B 0 1 0 0
# 4 D 1 0 0 0 
# 5 E 2 1 1 0
# 6 F 3 2 1 0
# 8 H 0 0 1 1

If you have other criteria in mind, you could instead use any within apply:

ind <- apply(data[,-1], 1, function(x) {any(x >= 1)})
data[ind,]
#   A B C D E
# 2 B 0 1 0 0
# 4 D 1 0 0 0 
# 5 E 2 1 1 0
# 6 F 3 2 1 0
# 8 H 0 0 1 1
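
Both snippets above drop the first column by position. A minimal sketch of the same idea that instead uses the saved vector of column names and tests the condition directly, so it does not rely on the values being non-negative, might look like this. Here `cols` is a hypothetical name standing in for the question's `list` (to avoid masking the base function), and `data` is assumed to be a plain data.frame.

```r
# Rebuild the example data as a plain data.frame
data <- data.frame(
  A = c("A", "B", "C", "D", "E", "F", "G", "H", "I"),
  B = c(0, 0, 0, 1, 2, 3, 0, 0, 0),
  C = c(0, 1, 0, 0, 1, 2, 0, 0, 0),
  D = c(0, 0, 0, 0, 1, 1, 0, 1, 0),
  E = c(0, 0, 0, 0, 0, 0, 0, 1, 0)
)
cols <- c("B", "C", "D", "E")  # hypothetical name for the saved column names

# data[cols] >= 1 is a logical matrix; rowSums counts the TRUE values per row
keep <- rowSums(data[cols] >= 1) > 0
result <- data[keep, ]
result
```

Because the comparison happens before the sum, the threshold or operator can be changed without touching the rest of the expression.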
Daniel
5

dplyr::filter_at will do just that.

library(dplyr)
data %>% filter_at(vars(-A),any_vars(.>=1))
#   A B C D E
# 1 B 0 1 0 0
# 2 D 1 0 0 0
# 3 E 2 1 1 0
# 4 F 3 2 1 0
# 5 H 0 0 1 1
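
In more recent dplyr releases the scoped verbs such as filter_at are documented as superseded, and if_any inside filter is the suggested replacement. A sketch of that form, assuming a dplyr version that provides if_any (1.0.4 or later) and using the hypothetical name `cols` for the saved column names, could be:

```r
library(dplyr)

# Rebuild the example data as a plain data.frame
data <- data.frame(
  A = c("A", "B", "C", "D", "E", "F", "G", "H", "I"),
  B = c(0, 0, 0, 1, 2, 3, 0, 0, 0),
  C = c(0, 1, 0, 0, 1, 2, 0, 0, 0),
  D = c(0, 0, 0, 0, 1, 1, 0, 1, 0),
  E = c(0, 0, 0, 0, 0, 0, 0, 1, 0)
)
cols <- c("B", "C", "D", "E")  # hypothetical name for the saved column names

# Keep rows where at least one of the named columns is >= 1
result <- data %>% filter(if_any(all_of(cols), ~ .x >= 1))
result
```

Swapping if_any for if_all would instead keep only rows where every named column meets the condition.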
moodymudskipper
2

You could always use Reduce. This is nice because you can put any type of logic you want into the function.

A simple method might be:

data[Reduce("|", as.data.frame(data[,list] >= 1)),]
#  A B C D E
#2 B 0 1 0 0
#4 D 1 0 0 0
#5 E 2 1 1 0
#6 F 3 2 1 0
#8 H 0 0 1 1

A little explanation: Reduce successively combines the elements of x using the same two-argument function. In this case the "|" operator is applied across the logical columns of the data.frame, which leaves one TRUE or FALSE per row.

If you wanted to do more complicated logic checks you could do that with your own anonymous function.
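
As a rough sketch of the anonymous-function variant, the per-column check and the combining step can each be written out explicitly. The name `cols` is a hypothetical stand-in for the saved column names, and `data` is assumed to be a plain data.frame.

```r
# Rebuild the example data as a plain data.frame
data <- data.frame(
  A = c("A", "B", "C", "D", "E", "F", "G", "H", "I"),
  B = c(0, 0, 0, 1, 2, 3, 0, 0, 0),
  C = c(0, 1, 0, 0, 1, 2, 0, 0, 0),
  D = c(0, 0, 0, 0, 1, 1, 0, 1, 0),
  E = c(0, 0, 0, 0, 0, 0, 0, 1, 0)
)
cols <- c("B", "C", "D", "E")  # hypothetical name for the saved column names

# Step 1: apply the per-column check, giving a list of logical vectors
checks <- lapply(data[cols], function(x) x >= 1)

# Step 2: fold the list into one logical vector; acc holds the result so far
keep <- Reduce(function(acc, nxt) acc | nxt, checks)

result <- data[keep, ]
result
```

Either function can be changed independently, for example a different test in step 1 or `&` instead of `|` in step 2.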

Mike H.
  • a more data.table way, really similar, from akrun : `data[data[, Reduce(`|`, lapply(.SD, `>=`, 1)), .SDcols = list]]` – denis Feb 27 '18 at 17:02
-1

Please try this using apply in R.

B<-c(0,0,0,1,2,3,0,0,0)
C<-c(0,1,0,0,1,2,0,0,0)
D<-c(0,0,0,0,1,1,0,1,0)
ef=data.frame(B,C,D)
# Row-wise test (MARGIN = 1): TRUE if any value in the row is at least 1
con=apply(ef,1,function(x) any(x>=1))
ef[con,]
ndmeiri
student_R123