In R, how can I delete rows that have missing values for all variables? I want to keep the rows that have only some values missing. I have tried the code posted here previously, but it is not working.
Viewed 200 times
3 Answers
0
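If mydf is a data.table:
mydf[complete.cases(mydf)]
If mydf is a data frame:
subset(mydf, complete.cases(mydf))
Note that complete.cases() keeps only the rows with no missing values at all, so both forms also drop rows that are only partly missing.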
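To make the difference concrete, here is a minimal base R sketch comparing complete.cases() with a filter that drops only the rows where every value is missing. The data frame and the names complete_only and not_all_na are made up for illustration.

```r
# Hypothetical example data: row 2 is entirely missing,
# rows 3 and 4 are only partly missing.
mydf <- data.frame(
  a = c(1, NA, 2, 3),
  b = c(4, NA, NA, 6),
  c = c(7, NA, NA, NA)
)

# complete.cases() is TRUE only for rows with no NA at all,
# so this keeps row 1 and nothing else.
complete_only <- mydf[complete.cases(mydf), ]

# Keep every row that has at least one non-missing value,
# which removes only row 2.
not_all_na <- mydf[rowSums(!is.na(mydf)) > 0, ]

nrow(complete_only)  # 1
nrow(not_all_na)     # 3
```

If the goal is to keep partly missing rows, the second filter is probably the one to use.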

Frostic
0
If df is your data table:
df[rowSums(is.na(df))!=ncol(df), ]
TEST
> df <- matrix(c(1, NA, 2, 3, 4, NA, NA, 6, 7, NA, NA, NA), nrow = 4, ncol = 3)
> df
     [,1] [,2] [,3]
[1,]    1    4    7
[2,]   NA   NA   NA
[3,]    2   NA   NA
[4,]    3    6   NA
> df[rowSums(is.na(df))!=ncol(df), ]
     [,1] [,2] [,3]
[1,]    1    4    7
[2,]    2   NA   NA
[3,]    3    6   NA
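One caveat with matrices: if only a single row survives the filter, the `[` operator drops the result to a plain vector by default. A possible way around this is to wrap the same rowSums test in a small helper that passes drop = FALSE. The helper name drop_all_na_rows is hypothetical and not part of the answer above.

```r
# Hypothetical helper: remove rows in which every value is NA,
# always returning an object with two dimensions.
drop_all_na_rows <- function(x) {
  keep <- rowSums(is.na(x)) != ncol(x)
  x[keep, , drop = FALSE]
}

# Small made-up matrix: row 1 is (1, NA), row 2 is (NA, NA).
m <- matrix(c(1, NA, NA, NA), nrow = 2, ncol = 2)

res <- drop_all_na_rows(m)
dim(res)  # 1 2, still a matrix rather than a vector
```

The same helper should also work on a data frame, since rowSums, is.na, and ncol all accept one.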

Song Zhengyi
0
If I understand the question correctly, you do not want to remove the rows with any missing data, which is what complete.cases does, but only those with all values missing.
library(tibble)
df <- tribble(
  ~x, ~y, ~z,
  1,  NA, 2,
  NA, NA, NA, # remove this row
  NA, 3,  4,
  5,  NA, NA
)
Here we want to remove the second row and only the second.
You can use apply over the rows of the table to get a logical value for each row indicating whether all of its values are missing, like this:
to_remove <- apply(
df, 1, # map over rows
function(row) all(is.na(row))
)
and then you can keep the rows where to_remove is FALSE:
df[!to_remove,]
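Because apply() converts its input to a matrix first, a data frame with mixed column types gets coerced to a common type before the function sees each row. That is harmless for an is.na() check, but if you prefer to avoid the conversion, one possible base R variant computes is.na() per column and combines the results with `&`. The data frame below is made up for illustration.

```r
# Hypothetical mixed-type data frame; row 2 is entirely missing.
df <- data.frame(
  x = c(1, NA, NA, 5),
  y = c("a", NA, "c", NA),
  stringsAsFactors = FALSE
)

# is.na() on each column gives one logical vector per column.
# Combining them with `&` yields TRUE only where a row is NA
# in every column.
to_remove <- Reduce(`&`, lapply(df, is.na))

# Keep the other rows; column types are left untouched.
result <- df[!to_remove, , drop = FALSE]
```

This removes the same row as the apply version would on this data.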

Thomas Mailund