How to sort "equal" floating point values

Question

I need a sort function that treats numbers that would be equal using all.equal() as if they are equal.

For instance, if you do:

library(plyr)
a = sample(c(0.8, 0.7), 30, replace=TRUE)
b = sample(c(1.1, 1.2), 30, replace=TRUE)
df = data.frame(a)
df$b = b
df$sum = a + b
arrange(df, desc(sum))

All pairs of (0.8, 1.1) will sort above pairs of (0.7, 1.2), which is not what I want--I want the random order to be preserved within the category of things that sum to 1.9.

This is happening because

> 1.1 + 0.8 > 1.2 + 0.7
[1] TRUE

and

> 1.1 + 0.8 == 1.2 + 0.7
[1] FALSE

I understand that this is a consequence of how floating point numbers work, and that R has a function all.equal() to test for approximate equality, that is, equality within a small tolerance. For example:

> all.equal(0.8+1.1, 0.7+1.2)
[1] TRUE
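
As a side note, all.equal() compares within a tolerance (by default roughly 1.5e-8, relative) and returns a character description instead of FALSE when the values differ, so it is usually wrapped in isTRUE(). A minimal sketch of the contrast with ==, where the names `x` and `y` are chosen here only for illustration:

```r
x <- 0.8 + 1.1
y <- 0.7 + 1.2

# Exact comparison: the two sums differ in their last bits.
exact_equal <- x == y

# Size of the discrepancy, far below all.equal()'s default tolerance.
gap <- abs(x - y)

# Tolerance-based comparison; isTRUE() guards against the character
# result that all.equal() returns when the values are not near-equal.
near_equal <- isTRUE(all.equal(x, y))

# A difference of 1e-6 exceeds the default tolerance.
not_near_equal <- isTRUE(all.equal(1, 1 + 1e-6))

print(c(exact_equal = exact_equal, near_equal = near_equal,
        not_near_equal = not_near_equal))
```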

So I'm looking for a sort function or a way to sort that behaves as all.equal() does and not as == does.

Edited to make clear this is not a duplicate of other questions.

Add a small random noise to all values, with a standard deviation a couple of orders of magnitude smaller than what you see in your list of numbers? That should produce random orders within groups — ekstroem, Jul 16 '17 at 17:18
Do not use comparison operators with floating point numbers, the results will not be dependable. — Pierre L, Jul 16 '17 at 17:20
@ekstroem That could work, wouldn't be my first choice, but I will resort to it (get it?) if I must. — Ben S., Jul 16 '17 at 17:39
@PierreLafortune please remove the exact duplicate tag, I rewrote the question — Ben S., Jul 16 '17 at 17:40
Reopened. But this has nothing to do with floating point now, remove the decimals using `8+11` and `7+12` and you have the same issue. The answer is to arrange by two columns `arrange(df, desc(sum), a)` — Pierre L, Jul 16 '17 at 17:48
Hi Pierre, not sure what you mean. If I take the decimals out of that code and run it, I see that the values that sum to 19 are a mix of 8+11 and 7+12 (i.e. the random order is preserved), so the issue does indeed go away (try it yourself). Anyway, thanks for removing the tag. — Ben S., Jul 16 '17 at 17:52
Round the sum `df$sum = round(a + b, 5)` and it will behave the way you are expecting. — Pierre L, Jul 16 '17 at 18:44
I think that using `signif` rather than `round` would be better. But I approve the previous suggestion. — F. Privé, Jul 17 '17 at 11:18
`round` only moves the problem to the midpoint boundaries: for example, `9.001+0.004` and `9.002+0.003` lie on different sides of `9005/1000`. But if you are sure that no input is close to such a boundary (say, all your inputs have at most 4 decimal places and you round at the fifth decimal place), then it's a great solution. — aka.nice, Jul 17 '17 at 13:49
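
The suggestions in the comments can be sketched roughly as follows, in base R. This is an illustration under stated assumptions, not a definitive answer: the seed, the tolerance `tol`, and the names `key`, `grp`, `sorted_round` and `sorted_tol` are all hypothetical. The first part sorts on a rounded key; the second groups sorted values whose neighbouring gaps are within a tolerance, which avoids the rounding boundary issue but can chain together values that are each close to their neighbour.

```r
set.seed(1)  # hypothetical seed, only so the sketch is reproducible
a <- sample(c(0.8, 0.7), 30, replace = TRUE)
b <- sample(c(1.1, 1.2), 30, replace = TRUE)
df <- data.frame(a = a, b = b, sum = a + b)

# Approach 1: sort on a rounded key, as suggested in the comments.
# Sums that differ only by floating point error share the same key.
key <- round(df$sum, 5)
# order() leaves ties in their original order, so rows with equal keys
# keep their incoming (random) order. Negating gives a descending sort.
sorted_round <- df[order(-key), ]

# Approach 2 (an assumption, not from the thread): tolerance-based groups.
tol <- 1e-8                     # hypothetical tolerance
o <- order(df$sum)              # positions of rows in ascending sum order
s <- df$sum[o]
# Start a new group whenever the gap to the previous value exceeds tol.
grp_sorted <- cumsum(c(TRUE, diff(s) > tol))
grp <- integer(nrow(df))
grp[o] <- grp_sorted            # map group ids back to the original rows
sorted_tol <- df[order(-grp), ] # descending by group, ties keep row order

head(sorted_round)
head(sorted_tol)
```

With the data from the question every sum is 1.9 up to floating point error, so both results should leave the rows in their original random order.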
