You could try:
df$head4 <- +(rowSums(is.na(df)) == ncol(df))
#  head1 head2 head3 head4
#1    34    32     6     0
#2    NA    NA    NA     1
#3    45    NA    11     0
#4    54    15    98     0
#5    45    56    NA     0
#6     3     1    78     0
#7    NA     5    NA     0
Here rowSums() counts the NA values in each row. If every entry in a row is NA, that count equals the number of columns of the data.frame, so the comparison with == ncol(df) returns TRUE; otherwise it returns FALSE. The unary + in front coerces the resulting logical vector to 0/1 values; it is shorthand for as.integer() in this case.
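To make the individual steps visible, here is a minimal sketch that rebuilds the example data from the output above and splits the one-liner into parts. The intermediate names na_count, all_na and flag are only illustrative and not part of the original code.

```r
# Example data matching the first three columns of the output above
df <- data.frame(
  head1 = c(34, NA, 45, 54, 45, 3, NA),
  head2 = c(32, NA, NA, 15, 56, 1, 5),
  head3 = c(6, NA, 11, 98, NA, 78, NA)
)

# Illustrative intermediate names, not part of the original answer
na_count <- rowSums(is.na(df))    # number of NA values in each row
all_na   <- na_count == ncol(df)  # TRUE only where every column is NA
flag     <- +all_na               # unary plus turns TRUE/FALSE into 1/0

df$head4 <- flag
df
```

Note that the check has to run before head4 exists. Once head4 has been added, ncol(df) is 4 and head4 itself is never NA, so running the same line again would flag no rows.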
Hope this helps.
A comment by @RichardTelford raised the question of speed, so I tried to verify his claim that one of the other answers is twice as fast as this one.
# Test data: 250,000 rows x 4 columns, with 30% of the cells set to NA at random
m <- matrix(runif(1e6), ncol = 4)
nas <- sample(1e6, 0.3 * 1e6)
m[nas] <- NA
df <- as.data.frame(m)
library(microbenchmark)
# The three candidate solutions
frowsums <- function(x) {+(rowSums(is.na(x)) == ncol(x))}
flapply <- function(x) {Reduce(`&`, lapply(x, is.na)) + 0L}
frowmeans <- function(x) {1 * (rowMeans(is.na(x)) == 1)}
# Run each solution 1000 times on the same data
res <- microbenchmark(
  frowsums(df),
  flapply(df),
  frowmeans(df),
  times = 1000L)
res
Unit: milliseconds
          expr      min       lq     mean   median       uq      max neval cld
  frowsums(df) 15.75257 16.63475 20.23377 17.14405 17.82396 80.63485  1000   b
   flapply(df) 15.16721 15.23180 18.19778 16.13413 16.60948 88.92303  1000  a
 frowmeans(df) 16.61643 17.56909 20.69433 18.03498 18.83867 81.54057  1000   b
As the results show, @RichardTelford's claim does not hold up in this benchmark. The lapply() version has the lowest median (about 16.1 ms versus 17.1 ms and 18.0 ms), but that is nowhere near a factor of two. With so little difference in speed between the three solutions, the simplest and most easily understood version is preferable from a programmer's perspective.
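Before comparing timings it is worth confirming that the three functions actually flag the same rows. The following is a small sketch of such a check; the data frame small, the seed and the result names r1 to r3 are arbitrary choices for illustration, and the function definitions are copied from the benchmark above.

```r
# Definitions copied from the benchmark above
frowsums  <- function(x) {+(rowSums(is.na(x)) == ncol(x))}
flapply   <- function(x) {Reduce(`&`, lapply(x, is.na)) + 0L}
frowmeans <- function(x) {1 * (rowMeans(is.na(x)) == 1)}

# Small illustrative data set: 100 rows x 4 columns, 30% of the cells NA
set.seed(42)                      # arbitrary seed, only for reproducibility
m <- matrix(runif(400), ncol = 4)
m[sample(400, 120)] <- NA
m[1, ] <- NA                      # force at least one all-NA row
small <- as.data.frame(m)

r1 <- frowsums(small)
r2 <- flapply(small)
r3 <- frowmeans(small)

# Compare by value: the results differ in storage type, so identical()
# would be too strict here
same_flags <- all(r1 == r2) && all(r1 == r3)
same_flags

# Storage types of the three results
c(class(r1), class(r2), class(r3))
```

The three results agree in value but not in type: the + and + 0L variants return integer vectors, while 1 * (...) returns a double vector. This matters only if later code depends on the exact type.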