I have a dataset with multiple columns. In some columns all values are missing, in other columns only a few values. When importing to R, the missing values are only shown as empty cells and not as NA.
As a result, when I try to remove NA values with the na.omit() function, nothing happens.
How should I handle this problem?

Besz15
- related: https://stackoverflow.com/questions/24172111/change-the-blank-cells-to-na – VYago Mar 28 '22 at 16:54
- How are you importing? – Jesse Anderson Mar 28 '22 at 17:18
2 Answers
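You could replace all the empty values with NA by using the following code:
library(dplyr)
# Replace every "" with NA. With dplyr >= 1.1.0, na_if() requires "" to match
# the column type, so this call assumes all columns are character.
your_data_with_NA <- your_data %>%
  mutate_all(na_if, "")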
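If dplyr is not available, the same replacement can be sketched in base R. This is an alternative to the answer's approach, not the same method, and the example data frame and its column names are made up for illustration:

```r
# Hypothetical example data with a blank cell in a character column
your_data <- data.frame(
  id   = 1:3,
  name = c("anna", "", "carl"),
  stringsAsFactors = FALSE
)

# Replace every cell equal to "" with NA
your_data_with_NA <- your_data
your_data_with_NA[your_data_with_NA == ""] <- NA

# na.omit() now drops the row that had the blank cell
na.omit(your_data_with_NA)
```

The comparison `your_data_with_NA == ""` is done cell by cell, so numeric columns are left untouched.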

Quinten
The easiest fix is often to handle this when importing your data. Most functions for importing data have an argument that specifies which values should be interpreted as NA, for example:
- read.csv and the like: na.strings
- readr::read_csv and the like: na
- readxl::read_excel and the like: na
Just set this argument to "", " ", or any other value that R does not already interpret as NA.
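As a minimal sketch of the import-time fix with read.csv, the example below writes a small temporary file and reads it back. The file contents, the column names, and the variable names `path`, `dat` and `clean` are assumptions made for illustration:

```r
# Write a small example file with blank cells (hypothetical data)
path <- tempfile(fileext = ".csv")
writeLines(c("id,name,score",
             "1,anna,10",
             "2,,20",
             "3,carl,"), path)

# Treat empty strings, single spaces and the literal "NA" as missing
dat <- read.csv(path, na.strings = c("", " ", "NA"))

# na.omit() now drops the rows that had blank cells
clean <- na.omit(dat)
```

Keeping "NA" in na.strings preserves the default behaviour for cells that already contain the text NA.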
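If this is not an option for you, you can replace the blank values with NA using:
library(dplyr)
# With dplyr >= 1.1.0, na_if() needs "" to match the column type; use
# where(is.character) instead of everything() if some columns are not character.
df %>%
  mutate(
    across(
      everything(),
      ~ na_if(.x, "")
    )
  )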
Note that mutate(across(everything())) has superseded mutate_all but does the same thing.
Data
df <- mtcars %>% head()
df[1, 1] <- ""
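Putting the pieces together on the example data above, a hedged end-to-end sketch might look like this. It restricts the replacement to character columns, which is a variation on the answer's code (assumed here to avoid type errors in newer dplyr versions), and the names `df_na` and `df_clean` are hypothetical:

```r
library(dplyr)

# Same example data as above: mpg becomes character because of the "" cell
df <- head(mtcars)
df[1, 1] <- ""

# Only touch character columns, so numeric columns are never compared with ""
df_na <- df %>%
  mutate(across(where(is.character), ~ na_if(.x, "")))

# The blank cell is now NA, so na.omit() drops the first row
df_clean <- na.omit(df_na)
nrow(df_clean)
```

Note that mpg stays a character column after this step. Converting it back to numeric, for example with as.numeric(), would be a separate step.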

jpiversen