I have a 17 (rows) by 20 (columns) matrix in which every value is either a number or NA. I am trying to remove every column that contains NA in any row, which is 11 of the 20 columns. I've been searching for an hour and have tried several methods, but couldn't get it right.

my.data[, !is.na(my.data[, 1:20])]

To me this makes the most sense, but it gives a 'subscript too long' error.

Rui Barradas

1 Answer

One basic approach would be

mydata[, !is.na(colSums(mydata))]
Ric S
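
To see the approach on something small, here is a minimal sketch using a hypothetical 3 by 4 matrix `m` (the matrix, its column names, and the variable names are made up for illustration). It also runs the NA-counting variant side by side, since the two can differ in edge cases: a column holding both `Inf` and `-Inf` sums to `NaN`, which `is.na()` also flags.

```r
# Hypothetical 3 x 4 numeric matrix; columns b and d each contain an NA.
m <- matrix(c(1, 2, 3,
              4, NA, 6,
              7, 8, 9,
              NA, 11, 12),
            nrow = 3,
            dimnames = list(NULL, c("a", "b", "c", "d")))

# colSums() returns NA for any column holding an NA, so !is.na() keeps the rest.
keep_sum <- !is.na(colSums(m))

# Alternative: count the NAs per column and keep the columns with none.
keep_count <- colSums(is.na(m)) == 0

# drop = FALSE keeps the result a matrix even if only one column survives.
clean <- m[, keep_sum, drop = FALSE]
print(clean)
```

Both masks have one logical value per column, which is what the column index of `[` expects.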
  • Beat me to it! I was going to suggest `mydata[ , colSums(is.na(mydata)) == 0]` – CCP Jan 20 '20 at 16:05
  • Thank you very much. So the logic is: sum the values in each column, and because any NA value turns the sum into NA, the `!is.na` filters those columns out. –  Jan 20 '20 at 16:12
  • @RandyMarsh correct! – Ric S Jan 20 '20 at 16:13
  • Having said that, is my initial approach not suitable for what I was trying to do, or is there an obvious error I'm not seeing? Thanks again, Ric S. –  Jan 20 '20 at 16:15
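  • It gives you that error because `is.na(my.data[, 1:20])` returns a logical matrix with the same dimensions as the data (17 by 20, so 340 values) rather than one value per column. Used as a column index, that is far longer than the 20 columns available, hence the "too long" error. – Ric S Jan 20 '20 at 16:19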
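
A quick way to check what the original expression hands to `[` is to inspect the shape of the `is.na()` result. The sketch below uses a hypothetical 3 by 4 matrix in place of the 17 by 20 data. It then reduces the mask to one value per column with `apply()` and `anyNA()`, which is one possible way to repair the original idea, not necessarily the only one.

```r
# Hypothetical 3 x 4 matrix standing in for the 17 x 20 data.
m <- matrix(c(1, 2, 3, 4, NA, 6, 7, 8, 9, NA, 11, 12), nrow = 3)

mask <- is.na(m)
dim(mask)     # same shape as m: 3 rows, 4 columns
length(mask)  # 12 logical values, but m has only 4 columns to index

# Reduce the mask to one logical per column before subsetting.
has_na <- apply(m, 2, anyNA)
m[, !has_na, drop = FALSE]
```

Because `mask` carries one value per cell, using it directly as a column index supplies more values than there are columns; collapsing it per column first gives an index of the right length.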