2

I want to remove rows from a data frame where a column has NA only if the other rows where the NA value is found matches others value in the data frame

For example,

df <- data.frame(ID = c(1,1,2,2),DAY=c(1,1,2,3), VAL=c(1,NA,NA,5))

I want to remove the second row because there is a missing value in VAL and there is already a value for VAL with ID = 1 and DAY = 1 to get

ID      DAY     VAL
1        1        1
2        2        NA
2        3        5

Any idea how to do this? I could try writing a loop, but that doesn't seem efficient.

Dale Wu
  • 199
  • 1
  • 11
  • possible duplicate of [How to remove "rows" with a NA value?](http://stackoverflow.com/questions/6138776/how-to-remove-rows-with-a-na-value) – MichaelChirico Jul 28 '15 at 20:26
  • 1
    @MichaelChirico I didn't find it as a duplicate because `na.omit` is more general and it omits all the rows with the `NA` values. But, here the OP wants to remove NA values selectively by group. – akrun Jul 28 '15 at 20:33

1 Answers1

3

This could be done by creating a logical condition within each grouping variable. Here, I am guessing 'ID' and 'DAY' as the grouping variable.

One option is using data.table. We convert the 'data.frame' to 'data.table' (setDT(df)). Grouped by 'ID', 'DAY', we get the row index (.I) of 'VAL' that satisfy the condition (sum(is.na(VAL))!= .N & is.na(VAL)) and remove the that row index from the dataset 'df'. The sum(is.na(VAL))!= .N gives a logical vector that checks whether the number of NA values in a group is not equal to number of rows in that group (.N). If the 'VAL' is also an 'NA' along with the previous condition, then that is removed.

library(data.table)
i1 <- setDT(df)[, .I[sum(is.na(VAL))!=.N & is.na(VAL)] , by = .(ID, DAY)]$V1
df[-i1]
#   ID DAY VAL
#1:  1   1   1
#2:  2   2  NA
#3:  2   3   5

Or a similar option with dplyr. We group by 'ID', DAY', and then use filter with the above mentioned condition.

library(dplyr)
df %>% 
   group_by(ID, DAY) %>% 
   filter(!(sum(is.na(VAL))!=n() & is.na(VAL)))
#  ID DAY VAL
#1  1   1   1
#2  2   2  NA
#3  2   3   5
akrun
  • 874,273
  • 37
  • 540
  • 662