
What is a convenient way to select the rows of a data.table that have at least one NA value in any of several variables? I found a way, but it becomes unwieldy when there are many variables to check.

Here is the working example:

library(data.table)

# Create a data table 
DT <- data.table(V1=1:5, V2=LETTERS[1:5])

# Insert some missing values
DT[c(1,3),V1 := NA]
DT[c(1,2),V2 := NA]

# Check the output
print(DT)

   V1 V2
1: NA NA
2:  2 NA
3: NA  C
4:  4  D
5:  5  E

# Select if there is at least one NA:
# My solution:

myDT <- DT[is.na(V1) | is.na(V2), ]

# Check output
print(myDT)

   V1 V2
1: NA NA
2:  2 NA
3: NA  C

This solution works, but every additional variable (V1, V2, V3, ...) needs its own is.na() term in the condition.

Is there a better way to do it?

Adrien
  • Thanks a lot for the answer! Also, I don't think my question is a duplicate, since the other question does not mention missing values in the title. You have to know the concept of complete/incomplete cases to find it. That is why I searched for 15 minutes without finding an answer, and why I asked. – Adrien Jun 18 '17 at 19:13
  • Negate the solutions here: [Remove rows with NAs (missing values) in data.frame](https://stackoverflow.com/questions/4862178/remove-rows-with-nas-missing-values-in-data-frame) – Henrik Jun 18 '17 at 19:14

1 Answer

Use complete.cases, which returns TRUE for rows with no missing values, and negate it.

myDT <- DT[!complete.cases(V1,V2), ]
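
To avoid typing every variable name, complete.cases can also be given a whole table or a subset of its columns. The following is a minimal sketch that rebuilds the example data from the question; the names cols, allCols, someCols and viaRowSums are only illustrative and not part of the original answer.

```r
library(data.table)

# Rebuild the example data from the question
DT <- data.table(V1 = 1:5, V2 = LETTERS[1:5])
DT[c(1, 3), V1 := NA]
DT[c(1, 2), V2 := NA]

# Check every column of the table at once
allCols <- DT[!complete.cases(DT)]

# Restrict the check to a chosen set of columns
# (cols is a hypothetical vector of column names)
cols <- c("V1", "V2")
someCols <- DT[!complete.cases(DT[, cols, with = FALSE])]

# Equivalent alternative: count the NAs per row and keep rows with at least one
viaRowSums <- DT[rowSums(is.na(DT[, cols, with = FALSE])) > 0]

print(allCols)
```

With the example data, all three expressions should select the same three rows as the is.na(V1) | is.na(V2) condition in the question.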
Nick Larsen