How to exclude missing data in specific columns in R

Question

I have a data frame with 15,105 rows and 127 columns. I'd like to drop the rows that have NA in some specific columns. I'm using the following command:

wave1b <- na.omit(wave1, cols=c("Bx", "Deq", "Gef", "Has", "Pla", "Ty"))

However, when I run it, it returns only 19 rows, when I expected 14,561 rows (it should have excluded only the rows with NA in the specified columns). I say this because I subset the df to check how many rows the deletion should leave.

Could anyone help me solve this issue? Thank you!

Since you probably want the whole `data.frame` returned it might be best to do: `wave1[rowSums(is.na(wave1[,c("Bx", "Deq", "Gef", "Has", "Pla", "Ty")])) == 0, ]` — Mike H., Apr 25 '18 at 13:39
`na.omit` does **not** have an argument `cols`. I have just tried it and it does nothing. You are probably removing all `NA` values from all columns. — Rui Barradas, Apr 25 '18 at 13:42
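
To make the two comments above concrete, here is a minimal sketch on toy data. The data frame below is a hypothetical stand-in for `wave1` (only the column names `Bx` and `Deq` come from the question; `Other` and all values are made up). It assumes the unknown `cols` argument is simply absorbed by `...`, which matches the behaviour described above.

```r
# Toy stand-in for wave1 (hypothetical values; `Other` is an invented column)
wave1 <- data.frame(
  Bx    = c(1, NA, 3, 4),
  Deq   = c(1, 2, NA, 4),
  Other = c(NA, NA, 3, NA)
)
cols <- c("Bx", "Deq")

# `cols` is not an argument of na.omit(), so both calls give the same result:
# every row with an NA in ANY column is dropped
same <- identical(na.omit(wave1, cols = cols), na.omit(wave1))
n_all <- nrow(na.omit(wave1))

# Keep only the rows with no NA in the selected columns
keep <- rowSums(is.na(wave1[, cols, drop = FALSE])) == 0
wave1b <- wave1[keep, ]
```

Here `na.omit` leaves no rows at all because every row has an NA somewhere, while the `rowSums` filter keeps rows 1 and 4, whose only missing value is in `Other`.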

score 2 · Answer 1 · answered Apr 25 '18 at 13:42

I think this code is not efficient but it could work:

df <- data.frame(A = rep(NA,3), B = c(NA,2,3),C=c(1,NA,2))
df
   A  B  C
1 NA NA  1
2 NA  2 NA
3 NA  3  2

The following removes only the rows that have a missing value in column B or in column C:

df[-which(is.na(df$B)|is.na(df$C)),]
   A B C
3 NA 3 2

answered Apr 25 '18 at 13:42

Mel Sscn


You don't need the `which`. You can just do `!(is.na(df$B) | is.na(df$C))` – Mike H. Apr 25 '18 at 13:44
It worked, but when each test is negated separately the right operator is & instead of |, so it is supposed to be like this: `df[!is.na(df$B) & !is.na(df$C), ]` – Nao Apr 25 '18 at 14:51
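
A small sketch of the two points raised in these comments, reusing the `df` from the answer. The second data frame, `clean`, is a hypothetical example with no missing values, added here only to show why dropping `which` is safer.

```r
df <- data.frame(A = rep(NA, 3), B = c(NA, 2, 3), C = c(1, NA, 2))

# Negating the OR and AND-ing the negations select the same rows (De Morgan)
by_or  <- df[!(is.na(df$B) | is.na(df$C)), ]
by_and <- df[!is.na(df$B) & !is.na(df$C), ]
same <- identical(by_or, by_and)

# Hypothetical data with no NA at all
clean <- data.frame(B = c(1, 2), C = c(3, 4))

# which() returns integer(0) here, and indexing with -integer(0) selects no rows
with_which <- clean[-which(is.na(clean$B) | is.na(clean$C)), ]

# The logical index keeps every row, as intended
with_logical <- clean[!(is.na(clean$B) | is.na(clean$C)), ]
```

So the `-which(...)` form silently returns an empty data frame when nothing is missing, whereas the logical form returns the data unchanged.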

score 0 · Answer 2 · answered Apr 25 '18 at 14:46

You can use `complete.cases` on just the columns you want to check (here `df[, -1]` drops column A):

> df[complete.cases(df[, -1]), ]
   A B C
3 NA 3 2
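
With 127 columns, selecting the columns by name is probably easier to read than dropping them by position. A minimal sketch of the same idea on the toy `df`, where the vector name `cols` is only an illustrative choice:

```r
df <- data.frame(A = rep(NA, 3), B = c(NA, 2, 3), C = c(1, NA, 2))

# Columns that must be non-missing; other columns (here A) are ignored
cols <- c("B", "C")

# complete.cases() returns TRUE for rows with no NA in the selected columns
res <- df[complete.cases(df[, cols, drop = FALSE]), ]
```

For the data in the question, `cols` would presumably be `c("Bx", "Deq", "Gef", "Has", "Pla", "Ty")`. The `drop = FALSE` keeps the subset a data frame even if only one column is named.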

answered Apr 25 '18 at 14:46

Jilber Urbina
