I have a data.table in R with 200 columns of integer values.

One of the columns is named group and it has 100 different possible values.

When I subset it with, for instance, subDT<-DT[group==N], some columns of the subset may sum to 0, i.e. sum(subDT$columnX) can be 0.

What I want is to display subDT, but only the columns where sum(subDT$columnN)!=0, something like subDT[group==0,.(columns where sum(column)>0)], keeping the column names intact of course.


EDIT

An example using the mtcars data would be:

DT<-as.data.table(mtcars)

Let's say that we want to subset mtcars to the rows where carb is 1, but display only the columns whose sum over that subset is less than 10:

DT[carb == 1, (sapply(DT[carb == 1],sum) < 10), with = FALSE]  

In this case, only the columns vs, am and carb are displayed, because their sums over the subset are less than 10.

David Arenburg
Felipe

1 Answer

Assuming that the sum is taken after the filter:

DT[group == N, !(sapply(DT[group == N],sum) == 0), with = FALSE]
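
The one-liner above filters the table twice. A minimal sketch of the same idea that filters once, using mtcars as in the question; the filter am == 0 and the names sub, keep and result are only illustrative assumptions, chosen so that one column (am) really does sum to 0:

```r
library(data.table)

DT <- as.data.table(mtcars)

# Filter once and keep the result, so the subset is not computed twice.
# (am == 0 is a hypothetical filter chosen for this illustration.)
sub <- DT[am == 0]

# Names of the columns whose sum over the subset is non-zero.
keep <- names(sub)[sapply(sub, sum) != 0]

# Select those columns by name; with = FALSE treats `keep` as a
# character vector of column names rather than an expression.
result <- sub[, keep, with = FALSE]
```

Here result keeps every column of the subset except am, with the original column names.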

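Can be made faster with setkey. Note that a bare numeric N in i would be read as a row number, so wrap it in .() to join on the key:

setkey(DT,group)
DT[.(N), !(sapply(DT[.(N)],sum) == 0), with = FALSE]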
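
On a keyed data.table, a plain number in i is read as a row number, while a value wrapped in .() is joined against the key. A small sketch of the difference on mtcars, keyed by carb purely for illustration (by_row and by_key are hypothetical names):

```r
library(data.table)

DT <- as.data.table(mtcars)
setkey(DT, carb)     # sorts DT by carb and marks carb as the key

by_row <- DT[4]      # a bare number is a row index: the 4th row only
by_key <- DT[.(4)]   # .() joins on the key: every row with carb == 4
```

For an integer key column such as group, the .() form is therefore the one that selects by value.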
Chris