I have a data frame with three columns. One holds a participant ID and has no NA values; the other two (the target variables) have some NAs scattered throughout. I'm trying to use the solution explained here (remove rows where all columns are NA except 2 columns) to remove only the rows where both target variables are NA, but for some reason my implementation seems to indiscriminately remove every row that contains any NA.
Here is a sample of what the unprocessed df looks like:
ID | a | b |
---|---|---|
1 | ab | NA |
1 | NA | ab |
1 | NA | NA |
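
In case it helps to reproduce this, the sample can be built roughly as follows. This is only a sketch of the toy data: the values are copied from the table above, and I'm assuming the target columns are character vectors (in my real data the types may differ).

```r
# Toy version of the unprocessed df, copied from the table above.
# Assumption: `a` and `b` are character columns; NA here is NA_character_.
df <- data.frame(
  ID = c(1, 1, 1),
  a  = c("ab", NA, NA),
  b  = c(NA, "ab", NA)
)
```
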
Here is what I want the processed df to look like:
ID | a | b |
---|---|---|
1 | ab | NA |
1 | NA | ab |
And here is the code I'm using to try to accomplish this:
na_rows = df %>%
  select(-"ID") %>%
  is.na() %>%
  rowSums() > 0
processeddf <- df %>%
  filter(!na_rows)
However, this code returns a df with every row that contains any NA removed, so for the sample above it returns an empty df. Where am I going wrong here? I can't figure out where my logical error is.
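
For reference, here is a minimal base R sketch of the intermediate values on the toy sample, written without dplyr so it runs on its own. The names `targets`, `na_count`, `any_na`, `all_na` and `kept` are just for illustration and are not part of my original code. It suggests that the `> 0` comparison flags a row as soon as either target is NA, whereas comparing the count against the number of target columns would flag only rows where both are NA. I have only checked this on the toy data, so treat it as a guess rather than a confirmed diagnosis.

```r
# Toy data, same values as the sample table in the question.
df <- data.frame(ID = c(1, 1, 1), a = c("ab", NA, NA), b = c(NA, "ab", NA))

targets <- c("a", "b")                    # hypothetical name for the target columns
na_count <- rowSums(is.na(df[targets]))   # number of NAs per row across the targets

any_na <- na_count > 0                    # what my code computes: at least one NA
all_na <- na_count == length(targets)     # alternative: every target is NA

# Keeping the rows that are not NA in all targets leaves the first two rows.
kept <- df[!all_na, ]
```

If this reading is right, the same comparison could presumably be swapped into the dplyr pipeline in place of `> 0`.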