How to remove rows from an R data frame that have NA in two columns (NA in both columns, not just one)?

Question

I have an R data frame df, shown below.

I want to remove the rows that have NA in BOTH a and b.

My desired output will be

I tried something like this:

df1<-df[(is.na(df$a)==FALSE & is.na(df$b)==FALSE),]

but this removes every row that has an NA in either column (it behaves like an OR). I need an AND here: drop a row only when both a and b are NA.

How do I do it?

How about this `which(rowSums(df, na.rm = T)>0)`. – Chirayu Chamoli, Nov 23 '16 at 06:51
df[ rowSums( is.na(df[ , 1:2]) ) == 2, ] – IRTFM, Nov 23 '16 at 07:10

score 2 · Answer 1 · answered Nov 23 '16 at 07:05

You can try:

df1<-df[!(is.na(df$a) & is.na(df$b)), ]
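To see why the attempt in the question drops too many rows, it may help to compare the two conditions side by side. The sketch below uses the sample values from the Data block posted in another answer on this page; the names `strict` and `lenient` are only illustrative.

```r
# Sample values taken from the "Data" block posted in another answer here
df <- data.frame(a = c(1L, 2L, 3L, NA, 4L, NA, 5L),
                 b = c(6L, NA, 7L, 8L, 9L, NA, 10L))

# Condition from the question: keeps a row only when a AND b are both present,
# so any row with at least one NA is dropped
strict <- !is.na(df$a) & !is.na(df$b)

# Condition from this answer: drops a row only when a AND b are both NA
lenient <- !(is.na(df$a) & is.na(df$b))

nrow(df[strict, ])   # 4 rows survive (rows 1, 3, 5, 7)
nrow(df[lenient, ])  # 6 rows survive (only row 6 is removed)
```

By De Morgan's law, `!(is.na(df$a) & is.na(df$b))` is the same as `!is.na(df$a) | !is.na(df$b)`, which is the "keep the row if at least one value is present" rule the question asks for.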


Kumar Manglam

score 2 · Answer 2 · edited May 23 '17 at 11:45

Using rowSums:

df[!rowSums(is.na(df))==2,]

A slightly shorter version, saving a character [1]:

df[rowSums(is.na(df))!=2,]

output:

This can be generalized using ncol:

df[!rowSums(is.na(df))==ncol(df),]

[1] credits: alistaire
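Note that `is.na(df)` looks at every column, so the `== 2` and `ncol(df)` forms only match the question when `df` holds just `a` and `b`. If the data frame has other columns, one option is to restrict the check to the columns of interest. A minimal sketch, where the extra column `id` and the vector `cols` are hypothetical additions for illustration:

```r
# Same a/b values as the sample data on this page, plus a hypothetical id column
df <- data.frame(id = letters[1:7],
                 a  = c(1L, 2L, 3L, NA, 4L, NA, 5L),
                 b  = c(6L, NA, 7L, 8L, 9L, NA, 10L),
                 stringsAsFactors = FALSE)

cols <- c("a", "b")  # columns that must not all be NA

# Count NAs per row in the chosen columns only, then keep rows
# where fewer than length(cols) of them are NA
res <- df[rowSums(is.na(df[cols])) != length(cols), ]
res$id  # "a" "b" "c" "d" "e" "g"
```

Using `df[cols]` rather than `df[, cols]` keeps the result a data frame even when `cols` names a single column, so `rowSums` still works.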

answered Nov 23 '16 at 07:12

Prradep

You could save a character and just use `!=` – alistaire Nov 23 '16 at 07:26

score 1 · Answer 3 · answered Nov 23 '16 at 06:49

We can use rowSums on a logical matrix (is.na(df1)) and convert that to a logical vector (rowSums(...) < ncol(df1)) to subset the rows.

df1[rowSums(is.na(df1)) < ncol(df1),]

Or another option is Reduce with lapply

df1[!Reduce(`&`, lapply(df1, is.na)),]
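The `Reduce` version can be easier to follow when split into steps. The sketch below assumes `df1` holds the sample values posted in the Data block elsewhere on this page; `na_by_col` and `all_na` are illustrative names.

```r
# Assumed contents of df1: the sample values from the "Data" block on this page
df1 <- data.frame(a = c(1L, 2L, 3L, NA, 4L, NA, 5L),
                  b = c(6L, NA, 7L, 8L, 9L, NA, 10L))

# One logical vector per column, TRUE where that column is NA
na_by_col <- lapply(df1, is.na)

# Combine the vectors element by element with &:
# TRUE only for rows where every column is NA
all_na <- Reduce(`&`, na_by_col)

which(all_na)   # 6
df1[!all_na, ]  # every row except row 6
```

Because `Reduce` folds over however many columns there are, the same line works unchanged for data frames with more than two columns.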

score 1 · Answer 4 · answered Nov 23 '16 at 06:51

Another approach

df[!apply(is.na(df),1,all),]
#   a  b
#1  1  6
#2  2 NA
#3  3  7
#4 NA  8
#5  4  9
#7  5 10

Data

df <- structure(list(a = c(1L, 2L, 3L, NA, 4L, NA, 5L), b = c(6L, NA, 
7L, 8L, 9L, NA, 10L)), .Names = c("a", "b"), class = "data.frame", row.names = c(NA, 
-7L))

score 0 · Answer 5 · answered Nov 23 '16 at 07:00

This will also work:

df[apply(df, 1, function(x) sum(is.na(x)) != ncol(df)),]

   a  b
1  1  6
2  2 NA
3  3  7
4 NA  8
5  4  9
7  5 10


Sandipan Dey

score 0 · Answer 6 · answered Nov 23 '16 at 07:08

My thought is basically the same as the other replies.

For any row that consists entirely of NAs, the sum of !is.na(ROW) is always zero. So you just have to drop the rows where that count is zero.

So you can just do:

# Logical indexing is also safe when no row is all NA;
# df[-which(...), ] would return zero rows in that case
df1 <- df[rowSums(!is.na(df)) != 0, ]
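Dropping rows with a negative `which()` index has an edge case worth knowing: when nothing matches, `which()` returns `integer(0)`, and indexing with `-integer(0)` selects no rows at all instead of all of them. A small sketch, using a hypothetical data frame `clean` that contains no all-NA rows:

```r
# Hypothetical data with no all-NA rows
clean <- data.frame(a = 1:3, b = 4:6)

# No row is entirely NA, so there is nothing to drop
idx <- which(rowSums(!is.na(clean)) == 0)  # integer(0)

via_which   <- clean[-idx, ]                         # negative empty index
via_logical <- clean[rowSums(!is.na(clean)) != 0, ]  # logical index

nrow(via_which)    # 0, every row is lost
nrow(via_logical)  # 3, all rows are kept
```

For that reason a logical condition is usually the safer way to subset when the set of rows to drop might be empty.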
