
I am trying to use the function na_if from the package dplyr to replace a certain value in a data frame column with NA. For example:

> nyc_ci <- data.frame(nyc_ci_lower, nyc_ci_upper) # Creating a data frame with 2 variables
> dput(nyc_ci[1:10,])
structure(list(nyc_ci_lower = c(0.126589039921449, 0.126589039921449, 
0.126589039921449, 0.126589039921449, 0.126589039921449, 
0.126589039921449, 0.126589039921449, 0.126589039921449, 
0.126589039921449, 0.126589039921449), nyc_ci_upper = 
c(18.4443972705697, 18.4443972705697, 18.4443972705697, 
18.4443972705697, 18.4443972705697, 18.4443972705697, 
18.4443972705697, 18.4443972705697, 18.4443972705697, 
18.4443972705697)), row.names = c(NA, 10L), class = "data.frame")

> nyc_ci_lower_na <- na_if(nyc_ci$nyc_ci_lower, 0.126589039921449) # Attempting to replace 0.126589039921449 with NA 
> dput(nyc_ci_lower_na[1:10])
  c(0.126589039921449, 0.126589039921449, 0.126589039921449, 
  0.126589039921449, 0.126589039921449, 0.126589039921449, 
  0.126589039921449, 0.126589039921449, 0.126589039921449, 
  0.126589039921449)

However, when I do this, the value in question is not replaced by NA. I did this once before with a column from another data frame and it worked fine. Is there anything I should do differently?

lhn24
  • Welcome to SO! Without the data, it is hard to find the error. Try to include your data with `dput`, have a look [here](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). My guess is that it is a floating-point value that actually has more digits than are printed, so `0.1265890` is not precise enough – starja Jul 09 '20 at 15:16
  • Thanks! I made edits based on your suggestions. However, it still does not work even with the more precise value. Any other tips? @starja – lhn24 Jul 09 '20 at 15:29
  • Perhaps you should round or truncate the values to get rid of the digits that are not shown. – Martin Gal Jul 09 '20 at 15:45
  • With your edits, the code you have works for me. As @starja and @Martin Gal are mentioning, maybe it's about floating point precision? The [`near`](https://dplyr.tidyverse.org/reference/near.html) function from dplyr can help if that's the cause. – ravic_ Jul 09 '20 at 17:35
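
If floating-point precision is indeed the cause, as the comments above suggest, a tolerance-based comparison may behave better than the exact match that `na_if` performs. The following is only a minimal sketch on made-up data (the vector `x` and the target `0.3` are hypothetical, not the values from the question); it assumes dplyr is installed and uses `near` with its default tolerance:

```r
library(dplyr)

# Hypothetical data: 0.1 + 0.2 prints as 0.3 but is not bit-identical to 0.3
x <- c(0.1 + 0.2, 0.3, 1.5)

# Printing with 17 significant digits exposes the hidden difference
sprintf("%.17g", x)

# na_if() compares exactly, so only the second element becomes NA
exact <- na_if(x, 0.3)

# near() compares within a small tolerance, so the first two become NA
x_na <- if_else(near(x, 0.3), NA_real_, x)
x_na
```

If this is what is happening in the question, `dput` may simply have printed fewer digits than the stored value actually has, so pasting the printed number back in would not match exactly. Whether that applies to the original `nyc_ci` data cannot be confirmed from the post alone.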
