0

I'm trying to use a for loop to build a vector of the names of columns that contain one or more NA values, but the result is always NULL.

hasnas <- c()

for (i in 1:length(data)){
  if(sum(is.na(data[,i]))>0){
    hasnas <- append(hasnas,names(data[,i]))
  }
}

> hasnas
NULL

Any help would be sincerely appreciated.

  • It's easier to help you if you include a simple [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input and desired output that can be used to test and verify possible solutions. – MrFlick Mar 26 '21 at 01:45
  • Cleaner code: `if( any(is.na(data[,i])) ){...}` – IRTFM Mar 26 '21 at 01:58

4 Answers

2

A couple of base R options:

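# Option 1: count the NAs in each column, keep names where the count is positive
hasnas <- names(data)[colSums(is.na(data)) > 0]

# Option 2: keep only the columns that contain any NA, then take their names
hasnas <- names(Filter(function(x) any(is.na(x)), data))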
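
As a quick check that the two options agree, here is a minimal sketch on a made-up data frame (the sample data and the names `na_counts`, `opt1`, and `opt2` are illustrative assumptions, not part of the question):

```r
# Hypothetical sample data: columns b and c contain NA, a and d do not.
data <- data.frame(
  a = 1:3,
  b = c(1, NA, 3),
  c = c(NA, "x", "y"),
  d = c(TRUE, FALSE, TRUE)
)

# is.na() on a data frame returns a logical matrix;
# colSums() then counts the NAs in each column.
na_counts <- colSums(is.na(data))

opt1 <- names(data)[na_counts > 0]
opt2 <- names(Filter(function(x) any(is.na(x)), data))

opt1
#> [1] "b" "c"
identical(opt1, opt2)
#> [1] TRUE
```

Option 1 also gives you the per-column NA counts for free in `na_counts`, which may be handy if you later need more than a yes/no answer.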
Ronak Shah
1

This uses R's `apply` function instead of a loop:

d <- data.frame(a = 1:2, b = c(1, NA), c = c(NA, NA), d = 1:2)

o <- apply(d, 2, function(x) any(is.na(x)))

names(o[sapply(o, isTRUE)])

[1] "b" "c"
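
A possible variant of the same idea, sketched with `sapply` so that each column is visited as-is rather than after `apply` has converted the data frame to a matrix (the name `has_na` is an assumption for illustration):

```r
d <- data.frame(a = 1:2, b = c(1, NA), c = c(NA, NA), d = 1:2)

# sapply() calls the function on each column and returns a named logical vector.
# vapply(d, anyNA, logical(1)) would be a stricter equivalent.
has_na <- sapply(d, function(x) any(is.na(x)))

# Index the names directly with the logical vector.
names(has_na)[has_na]
#> [1] "b" "c"
```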
Paul
0

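You just need to change names(data[,i]) to names(data)[i]. With a plain data frame, data[,i] returns the column as a bare vector, which has no names, so names(data[,i]) is NULL and append() adds nothing. See a full reprex below: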

data <- iris
data[["Sepal.Length"]][sample(100, 10)] <- NA
data[["Species"]][sample(100, 10)] <- NA

hasnas <- c()

for (i in 1:length(data)) {
  if(any(is.na(data[, i]))) {
    hasnas <- append(hasnas, names(data)[i])
  }
}

hasnas
#> [1] "Sepal.Length" "Species"

Created on 2021-03-25 by the reprex package (v1.0.0)
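
To see why the original loop ended up with NULL, here is a small sketch on a made-up two-column data frame (the sample data is an assumption for illustration):

```r
# Hypothetical sample data.
data <- data.frame(x = c(1, NA), y = c("a", "b"))

# With a plain data frame, selecting a single column with [, i] drops
# to a bare vector, and that vector carries no names attribute.
names(data[, 1])
#> NULL

# Indexing the data frame's own names gives the column name instead.
names(data)[1]
#> [1] "x"

# Appending NULL adds nothing, so the accumulator never grows.
hasnas <- c()
hasnas <- append(hasnas, names(data[, 1]))
is.null(hasnas)
#> [1] TRUE
```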

0

There's a concise (and, I think, elegant) way to do this with purrr:

library(tibble)  # provides tibble()

data <- tibble(
  a = rep(NA, 10),
  b = rnorm(10),
  c = rep(NA, 10)
)

Load the purrr package with `library(purrr)`, then:

names(data)[map_lgl(data, ~any(is.na(.x)))]
[1] "a" "c"