Removing all columns with all NAs in a data.frame without a loop in R

Question

I want to remove every column of a data.frame that consists entirely of NA values. I have two solutions, shown below, but both use looping constructs under the hood.

Is there a faster solution that does NOT require the apply family of functions or a for loop?

PS: If I wanted to remove all rows that are entirely NA, I could avoid loops like so: r[rowSums(is.na(r)) != ncol(r), ]. Can a similar approach work for columns?

r <- data.frame(AA = c(1, NA, 3), BB = c(NA, NA, NA), CC = c(3, NA, 5))

r[sapply(r, function(x) !all(is.na(x)))] # Solution 1

Filter(function(x) !all(is.na(x)), r)    # Solution 2

Checking each column to see if it consists exclusively of `NA` values by definition means that R has to iterate over that column under the hood at some point. I don't have a problem with your current solutions. — Tim Biegeleisen, Oct 05 '19 at 15:57
@TimBiegeleisen, Not necessarily. For example, if I wanted to remove all rows with all `NA` I could avoid loops like so: `r[rowSums(is.na(r)) != ncol(r), ]` but how about columns? — rnorouzian, Oct 05 '19 at 16:06

score 0 · Accepted Answer · answered Oct 05 '19 at 16:04

Just as rowSums works for rows, colSums works for columns:

r[, colSums(is.na(r)) != nrow(r)]

#  AA CC
#1  1  3
#2 NA NA
#3  3  5
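
One caveat with the one-liner above: with the default `drop = TRUE`, matrix-style indexing `r[, j]` returns a plain vector when only one column survives. The following is a minimal sketch, not part of the original answer, that reuses the question's data and adds `drop = FALSE`; the `keep`, `cleaned`, `one_col`, and `kept_one` names and the single-column data frame are made up for illustration.

```r
# Example data from the question: column BB is entirely NA.
r <- data.frame(AA = c(1, NA, 3), BB = c(NA, NA, NA), CC = c(3, NA, 5))

# is.na(r) is a logical matrix, so colSums() counts the NAs per column.
# keep is TRUE for columns with at least one non-NA value.
keep <- colSums(is.na(r)) != nrow(r)

# drop = FALSE keeps the result a data.frame regardless of column count.
cleaned <- r[, keep, drop = FALSE]

# Hypothetical edge case: only one column contains any data.
one_col <- data.frame(AA = c(1, NA, 3), BB = c(NA, NA, NA))
kept_one <- one_col[, colSums(is.na(one_col)) != nrow(one_col), drop = FALSE]
```

List-style indexing, `r[keep]`, is an alternative that also always returns a data.frame, so `drop = FALSE` is only needed with the `r[, keep]` form.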

Ronak Shah
