2

Can someone please explain why the setdiff function in R fails in the following example? Note that it does not matter if I use base R or dplyr.

#setdiff doesn't work
poly1 <- polynom::polynomial(c(-2,1))
poly2 <- polynom::polynomial(c(-4,0,1))
solve_poly1 <- solve(poly1)
solve_poly2 <- solve(poly2)
print(solve_poly2)
print(solve_poly1)
setdiff(solve_poly2, solve_poly1)
dplyr::setdiff(solve_poly2, solve_poly1)

#what is the structure?
str(solve_poly1)
str(solve_poly2)

#setdiff works
set1 <- c(2)
set2 <- c(-2,2)
setdiff(set2, set1)
dplyr::setdiff(set2, set1)

#what is the structure?
str(set1)
str(set2)

Here is my output:

> #setdiff doesn't work
> poly1 <- polynom::polynomial(c(-2,1))

> poly2 <- polynom::polynomial(c(-4,0,1))

> solve_poly1 <- solve(poly1)

> solve_poly2 <- solve(poly2)

> print(solve_poly2)
[1] -2  2

> print(solve_poly1)
[1] 2

> setdiff(solve_poly2, solve_poly1)
[1] -2  2

> dplyr::setdiff(solve_poly2, solve_poly1)
[1] -2  2

> #what is the structure?
> str(solve_poly1)
 num 2

> str(solve_poly2)
 num [1:2] -2 2

> #setdiff works
> set1 <- c(2)

> set2 <- c(-2,2)

> setdiff(set2, set1)
[1] -2

> dplyr::setdiff(set2, set1)
[1] -2

> #what is the structure?
> str(set1)
 num 2

> str(set2)
 num [1:2] -2 2

Based on the structure of my arrays, I cannot see why the polynomial roots are treated differently than the arrays declared manually.

NateA
  • 25
  • 2

1 Answers1

1

This is not an error with setdiff, but a floating point error in R. Because the polynomial is solved numerically, the values of solve_poly2 are not exactly -2 and 2.

> solve_poly1 - set1
[1] 0
> solve_poly2 - set2
[1] -4.440892e-16 -4.440892e-16

To avoid this problem, you can define a function to do the setdiff operation within a tolerance, which I modified from this related answer.

setdiff_tolerance <- function(a, b, tol = 1e-7) {
  a[sapply(a, function(x) !any(abs(x - b) <= tol))]
}

setdiff_tolerance(solve_poly2, solve_poly1)
[1] -2
qdread
  • 3,389
  • 19
  • 36