Remove entire Row if Column == NA in R

Question

I am trying to remove a row and cannot get it to work. I have tried several of the existing Q&As on SO and nothing seems to work. I also tried the sjmisc library, as someone suggested, but the row is still there. Below is what I have tried, along with a sample of the df.

EDIT:

Here is a data frame to test with. I removed the picture of the data frame, which was incorrect and against site policy.

df<-data.frame(name=c('CAREY.PRICE',NA,'JOHN.SMITH'),GA=c(3,2,2),SV=c(2,2,NA),stringsAsFactors = FALSE)

Which will return:

name        |  GA  | SV
CAREY.PRICE |  3   | 2
NA          |  2   | 2
JOHN.SMITH  |  2   | NA

The issue with the below response:

df = df[complete.cases(df),]

Technically it answers the question as originally asked: if any column in a row has an NA, remove that row. I should have clarified that I wanted to check only a column of my choice, which here is the NA in df$name.

This removes both the NA row and the JOHN.SMITH row, which breaks my script because a player's name goes missing from my df.

I also had na.omit() in my script, and it removed both the NA row and JOHN.SMITH as well. There are 40 variables in my df, and writing out each one that may or may not contain an NA would be too much. My temporary solution is to change all NAs to 0:

df[is.na(df)] <- 0

RETURNS:

name        |  GA  | SV
CAREY.PRICE |  3   | 2
0           |  2   | 2
JOHN.SMITH  |  2   | 0

Then remove any row where df$name is 0:

df<-df[!(df$name==0),]

What I was looking for:

name        |  GA  | SV
CAREY.PRICE |  3   | 2
JOHN.SMITH  |  2   | 0

Fixed the question above @Tjebo. Let me know if there is a quicker solution. Thanks — Michael T Johnson, Sep 09 '18 at 04:29
https://stackoverflow.com/questions/11254524/omit-rows-containing-specific-column-of-na. I am using this custom made function `complete_fun` in those cases. I have it in my own personal utility package. Or, as the most upvoted answer to this question suggests or per my comment below - use `df[!is.na(df$name),]` — tjebo, Sep 09 '18 at 05:25
I will also flag this as a duplicate - really don't want to be mean. This will just help to direct people to the other question which had already very good answers. — tjebo, Sep 09 '18 at 05:35

score 2 · Answer 1 · answered Sep 08 '18 at 07:09

`complete.cases` finds the rows that don't have any NAs.

Hence, the answer to your question is:

df = df[complete.cases(df),]
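If only certain columns should count, `complete.cases` can be given just those columns instead of the whole data frame. The following is a minimal sketch using the sample data from the question; `drop_na_in` is a hypothetical helper name introduced here for illustration, not something from the thread or from a package.

```r
# Sample data from the question
df <- data.frame(
  name = c("CAREY.PRICE", NA, "JOHN.SMITH"),
  GA = c(3, 2, 2),
  SV = c(2, 2, NA),
  stringsAsFactors = FALSE
)

# Hypothetical helper (an assumption, not from the thread): keep the rows
# that have no NA in the columns named in `cols`; NAs elsewhere are ignored.
drop_na_in <- function(data, cols) {
  keep <- complete.cases(data[cols])  # data[cols] stays a data frame
  data[keep, , drop = FALSE]          # drop = FALSE keeps a data frame result
}

by_name <- drop_na_in(df, "name")            # drops only the row with NA in name
by_both <- drop_na_in(df, c("name", "SV"))   # also drops the row with NA in SV
```

With a single column of interest, `df[!is.na(df$name), ]` should give the same rows; the helper mainly pays off when several columns need checking.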

Omry Atia


it changes the output to a `factor` and not `dataframe` – Sal-laS Sep 08 '18 at 07:12
That's it! Thank you very much. Been struggling with this for a while now. – Michael T Johnson Sep 08 '18 at 07:12
@MichaelTJohnson is a `factor` output acceptable? – Sal-laS Sep 08 '18 at 07:13
I think `df[!is.na(df),]` might bring the same result - better sample data by the OP would help. I don't understand why `na.omit` would not yield the desired effect though. – tjebo Sep 08 '18 at 07:33
@Salman, I am not sure of the difference, to be honest. My script was successful. I will have to look up what it is. – Michael T Johnson Sep 08 '18 at 07:42
Found an error with this and I will discuss this above. – Michael T Johnson Sep 09 '18 at 03:53

score 1 · Accepted Answer · answered Sep 09 '18 at 04:32

Here's a way to filter out rows with an NA in the name field:

library(dplyr)
df %>% filter(!is.na(name))

#>          name GA SV
#> 1 CAREY.PRICE  3  2
#> 2  JOHN.SMITH  2 NA

Jon Spring


Nice, I would then apply the NA-to-0 conversion to the other variables. Thanks! – Michael T Johnson Sep 09 '18 at 05:12
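
The two-step follow-up described in this comment (drop rows with a missing name, then set the remaining NAs to 0) might look roughly like this in base R. It is a sketch built on the question's sample data rather than code from the thread; `cleaned` and `num_cols` are illustrative names, and limiting the zero-fill to numeric columns is an assumption about what is wanted.

```r
# Sample data from the question
df <- data.frame(
  name = c("CAREY.PRICE", NA, "JOHN.SMITH"),
  GA = c(3, 2, 2),
  SV = c(2, 2, NA),
  stringsAsFactors = FALSE
)

# Step 1: keep only the rows where name is not NA
cleaned <- df[!is.na(df$name), ]

# Step 2: replace the remaining NAs with 0, in numeric columns only
num_cols <- vapply(cleaned, is.numeric, logical(1))
cleaned[num_cols] <- lapply(
  cleaned[num_cols],
  function(x) replace(x, is.na(x), 0)
)

print(cleaned)
```

Doing the filter first means the name column never needs to be turned into 0 and compared afterwards, so the 40 other variables do not have to be listed by hand.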
