
I'm attempting to remove all columns whose colSums == 0 from my matrix. Using this post I have gotten close.

data2 <- subset(data, Gear == 20)              # subset of the larger data matrix
i <- (colSums(data2[, 3:277], na.rm = TRUE) != 0)  # restricted to columns 3:277, the numeric columns
data3 <- data2[, i]                            # intended: keep only columns with non-zero colSums

This creates a logical vector, i, that correctly flags the columns in that range: TRUE for columns to keep, FALSE for columns to drop. However, the final line removes columns by some logic other than the intended "if TRUE, include; if FALSE, exclude". The goal: remove all columns in that range that have colSums == 0. The problem: the final subsetting step does not seem to pick out the columns that i identifies. Any suggestions as to what I'm missing?

Update: Adding dummy data to use:

a <- matrix(1:10, ncol = 10, nrow = 10)
a
a[, c(3, 5, 8)] <- 0
a
i <- (colSums(a, na.rm = TRUE) != 0)
b <- a[, i]

It works well here, so I'm not sure why it won't work on the real data above.
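One difference between the two snippets that may matter: in the dummy example the logical vector covers every column of `a`, whereas in the real data it covers only columns 3:277. The sketch below reproduces that difference on the dummy matrix. The names `i_full` and `i_part` are made up for illustration, and the range `3:10` is an assumed stand-in for `3:277`.

```r
a <- matrix(1:10, ncol = 10, nrow = 10)
a[, c(3, 5, 8)] <- 0

# Condition computed over all columns: one entry per column of a.
i_full <- colSums(a, na.rm = TRUE) != 0

# Condition computed over a sub-range: entries are relative to column 3.
i_part <- colSums(a[, 3:10], na.rm = TRUE) != 0

length(i_full) == ncol(a)   # lengths agree, so a[, i_full] lines up
length(i_part) == ncol(a)   # lengths differ, so a[, i_part] would be misaligned

# Shifting the TRUE positions by the two skipped columns realigns them.
realigned <- c(1:2, which(i_part) + 2)
realigned
```

If the lengths differ, positions in the logical vector no longer correspond to positions in the full object, which would be consistent with the "some other logic" behaviour described above.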

Jesse001
  • It's easier to help you if you include a simple [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input and desired output that can be used to test and verify possible solutions. – MrFlick May 12 '20 at 20:10
  • The only real problem I see is changing variable names. You check for column sums in `data2` but then you are subsetting `n20` in the final line. Those are probably completely different data.frames – MrFlick May 12 '20 at 20:12
  • You can use `n20[, c(1:2, which(i) + 2)]` – akrun May 12 '20 at 20:15
  • @MrFlick you are correct on the data frame referencing. That was an editorial mistake bringing it to the question from R. I also just went to make a reproducible example with dummy data and this line works perfectly fine on the dummy data... grrr. Also: I updated the original post as per your comments – Jesse001 May 12 '20 at 20:16

1 Answer


We can take the names of the columns flagged TRUE by the colSums condition and concatenate them with the first two column names, which were not used to build that condition. Indexing by name then selects the columns of interest:

data2[c(names(data2)[1:2], names(which(i)))]
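A minimal sketch of this on made-up data, alongside the position-based variant from the comments. The data frame below and its column names (`Station`, `sp1` to `sp4`) are hypothetical stand-ins for the real `data2`, assuming two leading ID columns followed by numeric columns.

```r
# Hypothetical stand-in for data2: two ID columns, then numeric columns.
data2 <- data.frame(
  Station = c("A", "B", "C"),
  Gear    = c(20, 20, 20),
  sp1     = c(0, 0, 0),   # all zero: should be dropped
  sp2     = c(1, 0, 2),
  sp3     = c(0, 0, 0),   # all zero: should be dropped
  sp4     = c(0, 3, 0)
)

# Logical vector over the numeric columns only, named by column.
i <- colSums(data2[, 3:ncol(data2)], na.rm = TRUE) != 0

# Select by name: ID columns plus the names of the TRUE entries.
keep_by_name <- data2[c(names(data2)[1:2], names(which(i)))]

# Select by position: shift the TRUE positions past the two ID columns.
keep_by_position <- data2[, c(1:2, which(i) + 2)]

names(keep_by_name)
names(keep_by_position)
```

Selecting by name avoids having to track the offset between the logical vector and the full set of columns, at the cost of requiring unique column names.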
akrun
  • The original post generates the names vector well. It's just not sub-setting the new matrix by that list. – Jesse001 May 12 '20 at 20:21
  • @Jesse001 the original post generates a logical vector. Here, it is subsetting that logical vector to get only the column names where there is TRUE values – akrun May 12 '20 at 20:22
  • @Jesse001 you can check `v1 <- setNames(c(TRUE, FALSE, TRUE, FALSE), 1:4); names(which(v1))` – akrun May 12 '20 at 20:23
  • That seems to have done the trick, thanks. Sorry I misunderstood the answer originally. – Jesse001 May 12 '20 at 20:26