remove rows with NA values in a specific column

Question

I have a huge dataset of about 1.6 million rows, and the variable (column) I need to focus on is 'temperature'. The temperature column has many NA values, and the other columns have NA values throughout as well. I want to remove only the rows with NA in the temperature column; I don't particularly care about NA values in the other columns. How can I do this? And if I later need to drop rows with NA in more than one column (e.g. temperature and depth), how can I select two columns? This is my code:

    otn <- tidync(filename, row.names=TRUE) %>% activate('D0')
    glider_table <- hyper_tibble(otn)
    attach(glider_table)
    summary(temperature)
    na.omit(glider_table)

`na.omit()` removes all rows with NA values regardless of which column they're in, so I need something more selective.

`glider_table[!is.na(glider_table$your_col), ]` should do the trick. Also [read here](https://stackoverflow.com/questions/10067680/why-is-it-not-advisable-to-use-attach-in-r-and-what-should-i-use-instead) for why it is generally not advised to use `attach()` (just to make you aware). — Andrew, Feb 12 '20 at 20:04
Does this answer your question? [Omit rows containing specific column of NA](https://stackoverflow.com/questions/11254524/omit-rows-containing-specific-column-of-na) — camille, Feb 12 '20 at 20:26
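
The subsetting approach from the comment above can be sketched in base R as follows. The small data frame is a made-up stand-in for `glider_table`, and the `salinity` column is an assumption added only to show that NAs in other columns are left untouched. For the two-column case, `complete.cases()` on just the selected columns is one possible way to do it.

```r
# Toy stand-in for glider_table; the values are invented for illustration.
glider_table <- data.frame(
  temperature = c(10.2, NA, 11.5, NA, 9.8),
  depth       = c(5, 12, NA, 30, 44),
  salinity    = c(NA, 35.1, 34.9, 35.0, NA)
)

# Keep rows where temperature is not NA; NAs in other columns are ignored.
temp_only <- glider_table[!is.na(glider_table$temperature), ]

# Keep rows where both temperature and depth are present.
keep <- complete.cases(glider_table[, c("temperature", "depth")])
temp_depth <- glider_table[keep, ]
```

Neither line modifies `glider_table` itself, so the result has to be assigned to a name to be kept.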

score 1 · Answer 1 · answered Feb 12 '20 at 20:47

You can use the `drop_na()` function from the tidyr package. The first argument is the dataset, and the remaining arguments are optional and name the specific columns to check for NA; rows with NA in any of those columns are dropped. For example: `drop_na(dataset, column)`.
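
As a minimal sketch applied to the question's columns, assuming tidyr is installed: the data frame below is a made-up stand-in for `glider_table`, and `salinity` is a hypothetical extra column included only to show that its NAs do not cause rows to be dropped.

```r
library(tidyr)

# Toy stand-in for glider_table; the values are invented for illustration.
glider_table <- data.frame(
  temperature = c(10.2, NA, 11.5, NA, 9.8),
  depth       = c(5, 12, NA, 30, 44),
  salinity    = c(NA, 35.1, 34.9, 35.0, NA)
)

# Drop rows only where temperature is NA.
temp_only <- drop_na(glider_table, temperature)

# Drop rows where temperature or depth is NA.
temp_depth <- drop_na(glider_table, temperature, depth)
```

Called with no column names, `drop_na(glider_table)` checks every column, which matches what `na.omit()` did in the question.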

Catie Kerman
