I have a data frame containing `<NA>` entries. It appears that these values are not treated as NA, since `is.na` returns FALSE. I would like to convert these values to NA but could not find a way to do it.
- I'm guessing you're talking about doing this in R? Otherwise, NA is pretty ambiguous... North America? Not available? – Marc B Oct 06 '14 at 16:48
- Yes, sorry, in R; NA stands for missing value. – user34771 Oct 06 '14 at 16:55
- Provide a sample of your data by adding the output of `dput(your.data.frame[some.rows.that.contain.such.values,])` to your question. – Roland Oct 06 '14 at 17:05
- The results of `str(your.data.frame)` would also be useful to let us see how the columns are stored. – Greg Snow Oct 06 '14 at 17:35
3 Answers
Use `dfr[dfr == "<NA>"] <- NA`, where `dfr` is your data frame.
For example:
> dfr <- data.frame(A = c(1, 2, "<NA>", 3), B = c("a", "b", "c", "d"))
> dfr
     A B
1    1 a
2    2 b
3 <NA> c
4    3 d
> is.na(dfr)
         A     B
[1,] FALSE FALSE
[2,] FALSE FALSE
[3,] FALSE FALSE
[4,] FALSE FALSE
> dfr[dfr == "<NA>"] <- NA  # key step
> is.na(dfr)
         A     B
[1,] FALSE FALSE
[2,] FALSE FALSE
[3,]  TRUE FALSE
[4,] FALSE FALSE
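The key step works through logical matrix indexing: comparing a data frame with a string yields a logical matrix of the same shape, and assigning through that matrix touches only the matching cells. Here is a minimal sketch of the same steps as a script, where `mask` is a name introduced only for illustration:

```r
# Same sample data as in the transcript above.
dfr <- data.frame(A = c(1, 2, "<NA>", 3), B = c("a", "b", "c", "d"))

# `mask` is an illustrative name: a logical matrix with the same shape as dfr.
mask <- dfr == "<NA>"
which(mask, arr.ind = TRUE)   # row/column positions of the placeholder string

dfr[mask] <- NA               # assign NA wherever the mask is TRUE

any(is.na(dfr))               # TRUE once the placeholder is a real NA
```

Note that the replacement changes only the cell values; column `A` keeps its original storage type (character or factor), so a separate conversion is needed if numeric values are wanted.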

Ujjwal
The two classes where this is likely to be an issue are character and factor. This should loop over a data frame and convert the "NA" values into true `<NA>`s, but only for those two classes:
make.true.NA <- function(x) {
  if (is.character(x) || is.factor(x)) {
    is.na(x) <- x == "NA"  # mark literal "NA" strings as missing
  }
  x
}
df[] <- lapply(df, make.true.NA)
(Untested in the absence of a data example.) Assigning to `df[]` retains the structure of the original data frame, which would otherwise lose its class attribute, because `lapply` alone returns a plain list. I see that Ujjwal thinks your spelling of NA has flanking "<>" characters, so you might try this more general version of the function:
make.true.NA <- function(x) {
  if (is.character(x) || is.factor(x)) {
    is.na(x) <- x %in% c("NA", "<NA>")  # either spelling becomes a true NA
  }
  x
}
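Since the function above was posted untested, here is a minimal sketch of how it could be checked. The sample data frame `dfrm` and its columns are hypothetical, and `stringsAsFactors = TRUE` is set only to mimic the factor columns reported in the question:

```r
make.true.NA <- function(x) {
  if (is.character(x) || is.factor(x)) {
    is.na(x) <- x %in% c("NA", "<NA>")
  }
  x
}

# Hypothetical sample data: two factor columns holding placeholder strings
# and one numeric column that should be left untouched.
dfrm <- data.frame(
  A = c("1", "2", "<NA>", "3"),
  B = c("a", "NA", "c", "d"),
  C = c(10, 20, 30, 40),
  stringsAsFactors = TRUE
)

dfrm[] <- lapply(dfrm, make.true.NA)

colSums(is.na(dfrm))   # count of true NAs per column
str(dfrm)              # still a data frame; A and B remain factors
```

For factor columns the placeholder remains in `levels()` as an unused level; `droplevels()` removes it if that matters.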

IRTFM
- Thanks for the help. The problem is that I cannot make a reproducible example in which I obtain both NA and `<NA>`. BondedDust's function allowed me to transform both NA and `<NA>` into true NA (they all appear as TRUE with `is.na(df)`), but the structure of my df shows that the variables that contain `<NA>` entries are coded as factor and not as numeric. – user34771 Oct 06 '14 at 20:32
- I suspect you would not want to make a conversion of all character vectors to numeric so you might want to apply this conversion just to particular columns: `dfrm[targets] <- lapply( dfrm[targets], make.true.NA) ; dfrm[targets] <- lapply( dfrm[targets], as.numeric)` – IRTFM Oct 06 '14 at 21:01
- Yes, I have to convert to numeric, but it works only if I unlist my data frame first. I have no idea why it appears as a list, but at least it is OK. – user34771 Oct 07 '14 at 06:45
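One caveat with the conversion discussed in these comments: calling `as.numeric` directly on a factor returns the internal integer level codes rather than the displayed values, so factor columns are usually passed through `as.character` first. Here is a minimal sketch, assuming the affected columns are factors; the data frame, the `targets` vector, and the helper `to_numeric` are hypothetical names used only for illustration:

```r
# Hypothetical data; stringsAsFactors = TRUE mimics the factor columns
# described in the comments.
dfrm <- data.frame(
  A = c("10", "20", "<NA>", "30"),
  B = c("a", "b", "c", "d"),
  stringsAsFactors = TRUE
)
targets <- "A"   # names of the columns to convert

# Hypothetical helper: placeholder strings become NA, then values become numeric.
to_numeric <- function(x) {
  x <- as.character(x)               # factor -> labels, not integer codes
  x[x %in% c("NA", "<NA>")] <- NA    # placeholder strings -> true NA
  as.numeric(x)
}

dfrm[targets] <- lapply(dfrm[targets], to_numeric)

dfrm$A               # 10 20 NA 30
is.numeric(dfrm$A)   # TRUE
```

Using `lapply` over `dfrm[targets]` converts each column separately. `dfrm[targets]` is itself a list of columns, which may be why a direct `as.numeric` call on it only worked after `unlist`.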
You can do this with the naniar package as well, using `replace_with_na` and associated functions.
dfr <- data.frame(A = c(1, 2, "<NA>", 3), B = c("a", "b", "c", "d"))
library(naniar)
# dev version - devtools::install_github('njtierney/naniar')
is.na(dfr)
#>          A     B
#> [1,] FALSE FALSE
#> [2,] FALSE FALSE
#> [3,] FALSE FALSE
#> [4,] FALSE FALSE
dfr %>% replace_with_na(replace = list(A = "<NA>")) %>% is.na()
#>          A     B
#> [1,] FALSE FALSE
#> [2,] FALSE FALSE
#> [3,]  TRUE FALSE
#> [4,] FALSE FALSE
# You can also specify how to do this for many variables
dfr %>% replace_with_na_all(~.x == "<NA>")
#> # A tibble: 4 x 2
#>       A     B
#>   <int> <int>
#> 1     2     1
#> 2     3     2
#> 3    NA     3
#> 4     4     4
You can read more about using `replace_with_na` in the naniar package documentation.

Nick Tierney