I have a data frame and I want to remove all columns with fewer than 1000 observations (non-NA values). The approach below works fine, but is there a more elegant solution?

vec <- numeric()

for(i in 1:ncol(dat))
{
    if(length(dat[,i][!is.na(dat[,i])]) >= 1000) 
        vec <- c(vec, i)
}

dat <- dat[,vec]
Roman Luštrik
vitor
  • Please add a reproducible sample so that people here can help you. See http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example What is the structure of `dat`? Can you paste the output of `dput(dat)` here? – CHP Mar 18 '13 at 17:33
  • This is a pretty broad question that applies to all `data.frame`s so I don't know what you'd expect to learn by seeing the object. `dput(head(dat))` would be a much better idea btw, since he's talking about 1000s of observations. – Señor O Mar 18 '13 at 18:56

1 Answer

This should work:

dat[,colSums(!is.na(dat))>=1000]

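Here we first check which elements of dat are not NA, and then compute the column sums of the resulting logical matrix, which gives the number of non-missing observations per column. Comparing those counts with 1000 yields TRUE for columns that contain at least 1000 observations and FALSE for the others, so the result can be used as a logical index that subsets the columns of the original dat data frame.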
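To make the indexing step concrete, here is a minimal sketch on a toy data frame. The column names `a`, `b`, `c` and the threshold `min_obs <- 3` are hypothetical stand-ins for the real `dat` and its 1000-observation cutoff. The `drop = FALSE` argument is an addition that is not part of the one-liner above; it is assumed here to be desirable because it keeps the result a data frame even if only one column passes.

```r
# Hypothetical toy data; the real `dat` and the cutoff of 1000 are
# replaced by a small example and a threshold of 3.
dat <- data.frame(
  a = c(1, 2, 3, 4, 5),
  b = c(1, NA, NA, NA, 5),
  c = c(NA, 2, 3, 4, NA)
)
min_obs <- 3

# Count the non-missing values in each column.
n_obs <- colSums(!is.na(dat))

# Keep the columns that meet the threshold. drop = FALSE keeps the
# result a data frame even when only one column qualifies.
kept <- dat[, n_obs >= min_obs, drop = FALSE]
names(kept)
```

In this toy case column `b` has only two non-missing values, so `kept` contains columns `a` and `c`.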

Jouni Helske
  • While this is the most logical answer in the absence of data, we don't yet know the structure of `dat`. Ah well, +1 anyway. – CHP Mar 18 '13 at 17:35
  • You can change `>` for `>=` to get exactly what the OP wants. – Ferdinand.kraft Mar 18 '13 at 17:36
  • @geektrader He did say data frame and that his code works, so I thought this should always work within those limits. But yeah, it would be nice to have some kind of note in the Ask Question form about "dputting" your data. – Jouni Helske Mar 18 '13 at 17:43
  • @aguiar, why don't you accept the answer (to this one and your other questions)? – Arun Mar 18 '13 at 17:45
  • @Arun I can only accept it after 10 minutes. It's accepted now. – vitor Mar 18 '13 at 18:14
  • @aguiar, yes, I forgot about that. Also consider accepting answers to your other questions. – Arun Mar 18 '13 at 18:18