I want to remove all columns that are all NA. I have two solutions as shown below, but they both use looping structures under the hood.

Is there a faster solution that does NOT require an `*apply` function or a `for` loop?

PS: If I wanted to remove all rows that are all NA, I could avoid loops like so: `r[rowSums(is.na(r)) != ncol(r), ]`. Can a similar solution be applied to columns?

r <- data.frame(AA = c(1, NA, 3), BB = c(NA, NA, NA), CC = c(3, NA, 5) )

r[sapply(r, function(x) !all(is.na(x)))] # Solution 1

Filter(function(x) !all(is.na(x)), r)    # Solution 2
rnorouzian
  • Checking each column to see if it consists exclusively of `NA` values by definition means that R has to iterate over that column under the hood at some point. I don't have a problem with your current solutions. – Tim Biegeleisen Oct 05 '19 at 15:57
  • @TimBiegeleisen, Not necessarily. For example, if I wanted to remove all rows with all `NA` I could avoid loops like so: `r[rowSums(is.na(r)) != ncol(r), ]` but how about columns? – rnorouzian Oct 05 '19 at 16:06

1 Answer

Similar to `rowSums` for rows, we can use `colSums` for columns:

r[, colSums(is.na(r)) != nrow(r)]

#  AA CC
#1  1  3
#2 NA NA
#3  3  5
Ronak Shah
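
As a hedged sketch of the `colSums` approach from the answer above: `is.na()` on a data frame returns a logical matrix, so `colSums()` can count the `NA`s per column without an explicit `*apply` call. The names `keep`, `res`, `r1`, and `one` are illustrative and not from the original thread, and `drop = FALSE` is an addition here, assuming you want a data frame back even when only one column survives. Note that this builds a full logical matrix in memory, so it may not be faster than the `sapply`/`Filter` versions on every data set.

```r
r <- data.frame(AA = c(1, NA, 3), BB = c(NA, NA, NA), CC = c(3, NA, 5))

# is.na() on a data frame yields a logical matrix; colSums() counts NAs per column
keep <- colSums(is.na(r)) != nrow(r)

# drop = FALSE keeps the result a data frame regardless of how many columns remain
res <- r[, keep, drop = FALSE]

# Edge case (illustrative): only one column is not entirely NA
r1  <- data.frame(AA = c(1, NA, 3), BB = c(NA, NA, NA))
one <- r1[, colSums(is.na(r1)) != nrow(r1), drop = FALSE]
```

Without `drop = FALSE`, the single-column case would be simplified to a plain vector by `[`, which is usually not what downstream code expects.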