Dataset attribute headings

I am a beginner. I want to drop every column of newTrain that has more than 100 NA values, and I am trying something like this:

for (i in newTrain) {
 count = 0
 count = length(which(is.na(newTrain$i)))
 names(-which(count>100))
}  

but this isn't working at all for me.

Jaap
Muhammad Wasif
  • Welcome to StackOverflow! Please read the info about [how to ask a good question](http://stackoverflow.com/help/how-to-ask) and how to give a [reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example/5963610). This will make it much easier for others to help you. – Jaap Oct 27 '16 at 11:47
  • `newTrain[apply(newTrain, 2, function(x) sum(is.na(x))<=100)]` – Lourdes Hernández Oct 27 '16 at 11:51
  • @LourdesHernández no need to use `apply` to `sum` on each `column`, there is a `colSums` function (see the posted answer) – Cath Oct 27 '16 at 11:52

1 Answer

We can first apply `is.na` to the entire data frame, which gives a logical matrix, and then use `colSums` to count the NAs in every column. Then we select the columns that have fewer than 100 NAs.

newTrain[colSums(is.na(newTrain)) < 100]
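
As a rough, reproducible sketch of the same idea: the toy data frame below and the names `threshold`, `na_counts`, `kept` and `keep` are made up for illustration, since the real newTrain is not shown in the question. It uses `<= threshold` so that it matches the `count > 100` condition from the question; switch to `<` if columns with exactly 100 NAs should also be dropped. The loop at the end is one possible way to repair the original attempt. That attempt fails because `for (i in newTrain)` iterates over the column contents rather than the column names, so `newTrain$i` looks for a column literally called "i" and returns NULL, and `names(-which(count>100))` never removes anything from the data frame.

```r
# Hypothetical toy data standing in for the real newTrain
newTrain <- data.frame(
  id       = 1:300,
  mostlyNA = c(rep(NA, 150), 1:150),  # 150 NAs, should be dropped
  fewNA    = c(rep(NA, 20), 1:280)    # 20 NAs, should be kept
)
threshold <- 100  # assumed cut-off, taken from the question's code

# Vectorised version: count NAs per column, keep columns at or below the cut-off
na_counts <- colSums(is.na(newTrain))
kept <- newTrain[na_counts <= threshold]
names(kept)
# "id" "fewNA"

# Loop version: iterate over column NAMES and index with [[ ]]
keep <- character(0)
for (col in names(newTrain)) {
  if (sum(is.na(newTrain[[col]])) <= threshold) {
    keep <- c(keep, col)
  }
}
loop_kept <- newTrain[keep]
```

Both versions should select the same columns; the `colSums` form is shorter and avoids the explicit loop.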
Ronak Shah