
I have an R data frame df, shown below:

 a   b   c
 1   6  NA
 2  NA   4
 3   7  NA
NA   8   1
 4   9  10
NA  NA   7
 5  10   8

I want to remove the rows that have NA in BOTH a and b.

My desired output is:

 a   b   c
 1   6  NA
 2  NA   4
 3   7  NA
NA   8   1
 4   9  10
 5  10   8

I tried something like this:

df1<-df[(is.na(df$a)==FALSE & is.na(df$b)==FALSE),]

but this removes every row that has an NA in either a or b (effectively an OR). I need an AND operation here.

How do I do it?

GaRaGe

6 Answers


You can try:

df1<-df[!(is.na(df$a) & is.na(df$b)), ]
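
To see why the attempt in the question behaves like an OR, it may help to compare the two masks side by side. This is only a sketch: the data frame is the question's data retyped by hand, and the names both_na, either_na and attempt are illustrative.

```r
# Question's data, rebuilt by hand (all three columns)
df <- data.frame(
  a = c(1, 2, 3, NA, 4, NA, 5),
  b = c(6, NA, 7, 8, 9, NA, 10),
  c = c(NA, 4, NA, 1, 10, 7, 8)
)

both_na   <- is.na(df$a) & is.na(df$b)   # NA in a AND b
either_na <- is.na(df$a) | is.na(df$b)   # NA in a OR b

# The attempt in the question keeps rows where neither column is NA,
# which is the same as dropping rows where either one is NA.
attempt <- df[!is.na(df$a) & !is.na(df$b), ]
stopifnot(identical(attempt, df[!either_na, ]))

# Negating the AND keeps every row except the one with both missing.
df1 <- df[!both_na, ]
print(df1)
```

By De Morgan's law, !x & !y equals !(x | y), so the original filter drops a row as soon as one of the two columns is NA.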
Kumar Manglam

Using rowSums (this assumes df contains only the columns a and b):

df[!rowSums(is.na(df))==2,]

A slightly shorter version that saves a character [1]:

df[rowSums(is.na(df))!=2,]

Output:

   a  b
1  1  6
2  2 NA
3  3  7
4 NA  8
5  4  9
7  5 10

This can be generalized to any number of columns using ncol, which drops the rows where every column is NA:

df[!rowSums(is.na(df))==ncol(df),]

[1] credits: alistaire
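
The rowSums versions above count NAs across every column of df, which matches the two-column data used in this answer. If df also carries the c column from the question, a sketch like the following restricts the count to a and b. The cols vector and the n_missing name are my own additions, not part of the answer.

```r
# Question's data with all three columns, retyped by hand
df <- data.frame(
  a = c(1, 2, 3, NA, 4, NA, 5),
  b = c(6, NA, 7, 8, 9, NA, 10),
  c = c(NA, 4, NA, 1, 10, 7, 8)
)

cols <- c("a", "b")                     # columns that must not all be NA
n_missing <- rowSums(is.na(df[cols]))   # NAs per row, counted in a and b only

# Keep rows where at least one of the chosen columns is present
df1 <- df[n_missing < length(cols), ]
print(df1)

# Counting over all columns would not flag row 6 here,
# because its c value is present (2 NAs out of 3 columns).
print(rowSums(is.na(df)) == ncol(df))
```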

Prradep

We can use rowSums on a logical matrix (is.na(df1)) and convert that to a logical vector (rowSums(...) < ncol(df1)) to subset the rows.

df1[rowSums(is.na(df1)) < ncol(df1),]

Another option is Reduce with lapply:

df1[!Reduce(`&`, lapply(df1, is.na)),]
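
The Reduce call can be unpacked into its two steps. The sketch below uses the two-column sample data and illustrative names (na_list, all_na) that are not from the answer.

```r
# Two-column sample data (a and b only), retyped by hand
df1 <- data.frame(
  a = c(1, 2, 3, NA, 4, NA, 5),
  b = c(6, NA, 7, 8, 9, NA, 10)
)

# Step 1: one logical vector per column, TRUE where the value is NA
na_list <- lapply(df1, is.na)

# Step 2: fold the vectors together with elementwise AND,
# giving TRUE only for rows where every column is NA
all_na <- Reduce(`&`, na_list)
print(all_na)

# Keep the rows that are not entirely NA
result <- df1[!all_na, ]
print(result)
```

Because Reduce folds over a list, the same pattern should extend to more than two columns without changes.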
akrun

Another approach, using apply with all:

df[!apply(is.na(df),1,all),]
#   a  b
#1  1  6
#2  2 NA
#3  3  7
#4 NA  8
#5  4  9
#7  5 10

Data

df <- structure(list(a = c(1L, 2L, 3L, NA, 4L, NA, 5L), b = c(6L, NA, 
7L, 8L, 9L, NA, 10L)), .Names = c("a", "b"), class = "data.frame", row.names = c(NA, 
-7L))
user2100721

This will also work:

df[apply(df, 1, function(x) sum(is.na(x)) != ncol(df)),]

   a  b
1  1  6
2  2 NA
3  3  7
4 NA  8
5  4  9
7  5 10
Sandipan Dey

My idea is basically the same as the other replies.

If a row consists entirely of NAs, then sum(!is.na(row)) is zero for that row. So you only need to drop the rows where that count is zero.

So you can just do:

df1 <- df[rowSums(!is.na(df)) > 0, ]  # logical index; df[-which(...), ] would drop every row when no row is all NA
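
One caveat with negative which() indexing is worth checking on data that has no all-NA row. The sketch below uses a small made-up data frame (not the question's data) to compare it with a plain logical index.

```r
# Hypothetical data in which no row is entirely NA
df <- data.frame(a = c(1, NA, 3), b = c(NA, 5, 6))

# which() finds no all-NA rows, so idx is integer(0)
idx <- which(rowSums(!is.na(df)) == 0)

# Negating an empty index still selects nothing: every row is lost
via_which <- df[-idx, ]
print(nrow(via_which))

# A logical index keeps all rows, as intended
keep <- rowSums(!is.na(df)) > 0
via_logical <- df[keep, ]
print(nrow(via_logical))
```

When at least one row is entirely NA, both forms should return the same rows, so the difference only shows up in this empty-match case.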
Chris