
I have read that == is unreliable when comparing floating-point numbers. For example:

This comparison works:

74.2+53.2
[1] 127.4
74.2+53.2==127.4
[1] TRUE

This one fails:

74.2+153.2
[1] 227.4
74.2+153.2==227.4
[1] FALSE

Wrapping the comparison in isTRUE() and all.equal() fixes it:

isTRUE(all.equal(74.2+153.2, 227.4))
[1] TRUE
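As a rough illustration of what is going on, the snippet below prints the two values with more digits than R shows by default and then repeats the tolerance-based check. It only uses base R; the variable names `x` and `y` are introduced here for the example.

```r
# Inspect the stored values with more digits than the default of 7.
x <- 74.2 + 153.2
y <- 227.4
print(c(x, y), digits = 17)   # may reveal a difference in the trailing digits

x == y                        # FALSE, as shown above
abs(x - y)                    # tiny but non-zero difference

# all.equal() compares within a tolerance instead of testing exact equality.
isTRUE(all.equal(x, y))       # TRUE, as shown above
```

The exact comparison fails because neither 74.2, 153.2 nor 227.4 can be represented exactly as a binary double, so the sum and the literal end up a hair apart.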

I am using dplyr to group by a variable, summarise one column as a sum and another as a max, and then test whether the two results are equal.

First a working example: 74.2+53.2=127.4

library(dplyr)
df<-tibble(group=c(  1,     2,     2,   3),
           B=    c(1.0,  74.2,  53.2, 1.0),
           C=    c(  2, 127.4, 127.4, 1.0))
df %>% 
  group_by(group) %>% 
  summarise(sumB = sum(B),
            maxC = max(C)) %>% 
  mutate(equal = case_when(sumB== maxC ~ "yes"))
  group  sumB  maxC equal
  <dbl> <dbl> <dbl> <chr>
1     1    1     2  NA   
2     2  127.  127. yes  
3     3    1     1  yes

Now slightly larger numbers: 74.2+153.2=227.4

df<-tibble(group=c(1,2,2,3),
           B=c(1.0,74.2,153.2,1.0),
           C=c(2,227.4,227.4,1.0))

df %>% 
  group_by(group) %>% 
  summarise(sumB = sum(B),
            maxC = max(C)) %>% 
  mutate(equal = case_when(sumB== maxC ~ "yes"))
# A tibble: 3 x 4
  group  sumB  maxC equal
  <dbl> <dbl> <dbl> <chr>
1     1    1     2  NA   
2     2  227.  227. NA   
3     3    1     1  yes

Direct math

74.2+153.2==227.4
[1] FALSE

Now wrapped in isTRUE() and all.equal()

isTRUE(all.equal(74.2+153.2, 227.4))

[1] TRUE

Now a modified case_when()

df %>% 
  group_by(group) %>% 
  summarise(sumB = sum(B),
            maxC = max(C)) %>% 
  mutate(equal = case_when(isTRUE(all.equal(sumB, maxC)) ~ "yes"))
# A tibble: 3 x 4
  group  sumB  maxC equal
  <dbl> <dbl> <dbl> <chr>
1     1    1     2  NA   
2     2  227.  227. NA   
3     3    1     1  NA

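This is not the expected result: every row is now NA, including groups 2 and 3, which should be "yes". What is the correct way to do a tolerance-based comparison inside case_when()?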
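One likely explanation, sketched with plain vectors rather than the tibble: all.equal() judges its two arguments as whole objects and isTRUE() reduces that to a single TRUE or FALSE, so case_when() receives one value for the entire column instead of one per row. The vectors `sumB` and `maxC` below mirror the summarised columns, and `tol` is an assumed tolerance chosen for this example.

```r
# Stand-ins for the summarised columns in the second example.
sumB <- c(1, 74.2 + 153.2, 1)
maxC <- c(2, 227.4, 1)

# all.equal() compares the vectors as a whole, so the result has length 1.
whole <- isTRUE(all.equal(sumB, maxC))
whole            # FALSE, because the first pair (1 vs 2) differs

# An element-wise comparison with an explicit tolerance (assumed value).
tol <- 1e-8
elementwise <- abs(sumB - maxC) < tol
elementwise      # FALSE TRUE TRUE
```

A single FALSE recycled across all rows would produce exactly the all-NA column shown above, whereas the element-wise version gives one result per group.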

Max Vollmer
M.Viking
  • related to https://stackoverflow.com/questions/9508518/why-are-these-numbers-not-equal – akrun Sep 29 '19 at 22:00
  • Thank you @akrun, the solution appears to be using the `near()` function. `mutate(equal = case_when(near(sumB, maxC) ~ "yes"))` – M.Viking Sep 29 '19 at 22:04
  • I rolled back your edit. Please do not add solutions into questions. – Max Vollmer Sep 30 '19 at 02:31
  • To reiterate, `isTRUE(all.equal())` doesn't work inside a `case_when()`, use `dplyr::near()` instead. ps. `near()` has a nice `tol = ` argument that allows for a wide range of tolerance. – M.Viking Oct 09 '19 at 14:18
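For completeness, here is a minimal sketch of the `near()` approach from the comments applied to the second data set in the question. It assumes dplyr is installed; `result` is a name introduced here only to hold the output.

```r
library(dplyr)

df <- tibble(group = c(1, 2, 2, 3),
             B = c(1.0, 74.2, 153.2, 1.0),
             C = c(2, 227.4, 227.4, 1.0))

result <- df %>%
  group_by(group) %>%
  summarise(sumB = sum(B),
            maxC = max(C)) %>%
  # near() compares element by element within a tolerance
  mutate(equal = case_when(near(sumB, maxC) ~ "yes"))

result$equal   # NA "yes" "yes"
```

If the default tolerance is too strict or too loose for the data, it can be adjusted through the `tol` argument, for example `near(sumB, maxC, tol = 1e-6)`.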
