You're correct that your compareTo() method is now inconsistent. It violates several of the requirements for this method. The compareTo() method must provide a total order over the values in the domain. In particular, as mentioned in the comments, a.compareTo(b) < 0 must imply that b.compareTo(a) > 0. Also, a.compareTo(a) == 0 must be true for every value, and the ordering must be transitive.
If your compareTo() method doesn't fulfil these requirements, then various pieces of the API will break. For example, if you sort a list containing an UNKNOWN value, then you might get the dreaded "Comparison method violates its general contract!" exception.
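To make those requirements concrete, here is a minimal sketch of a checker for reflexivity and sign symmetry. BrokenTag, with its rule that UNKNOWN is "less than" everything including itself, is an assumption made for illustration and is not your actual class; ContractCheck is likewise a hypothetical helper.

```java
import java.util.List;

// Hypothetical stand-in for a tag whose UNKNOWN value never compares as equal.
final class BrokenTag implements Comparable<BrokenTag> {
    static final BrokenTag UNKNOWN = new BrokenTag("<unknown>");
    final String id;

    BrokenTag(String id) {
        this.id = id;
    }

    @Override
    public int compareTo(BrokenTag o) {
        // Assumed SQL-style rule: UNKNOWN is "less than" everything, itself included.
        if (this == UNKNOWN || o == UNKNOWN) {
            return -1;
        }
        return id.compareTo(o.id);
    }
}

final class ContractCheck {
    // True if a.compareTo(a) == 0 holds for every element.
    static <T extends Comparable<T>> boolean isReflexive(List<T> values) {
        for (T a : values) {
            if (a.compareTo(a) != 0) {
                return false;
            }
        }
        return true;
    }

    // True if sgn(a.compareTo(b)) == -sgn(b.compareTo(a)) holds for every pair.
    static <T extends Comparable<T>> boolean isSignSymmetric(List<T> values) {
        for (T a : values) {
            for (T b : values) {
                if (Integer.signum(a.compareTo(b)) != -Integer.signum(b.compareTo(a))) {
                    return false;
                }
            }
        }
        return true;
    }
}
```

Under these assumptions both checks fail as soon as the list contains BrokenTag.UNKNOWN, while they pass for a list of ordinary String values. A check like this does not cover transitivity, so it is a quick sanity test rather than a proof of correctness.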
How does this square with the SQL requirement that null values aren't equal to each other?
For SQL, the answer is that it bends its own rules somewhat. There is a section in the Wikipedia article you cited that covers the behavior of things like grouping and sorting in the presence of null. While null values aren't considered equal to each other, they are also considered "not distinct" from each other, which allows GROUP BY to group them together. (I detect some specification weasel wording here.) For sorting, SQL lets an ORDER BY clause carry an additional NULLS FIRST or NULLS LAST to state where the nulls go; without one, their placement is left to the implementation.
So how does Java deal with IEEE 754 NaN, which has similar properties? The result of any comparison operator applied to NaN, other than !=, is false. In particular, NaN == NaN is false. This would seem to make it impossible to sort floating point values, or to use them as keys in maps. It turns out that Java has its own set of special cases. If you look at the specifications for Double.compareTo() and Double.equals(), they have special cases that cover exactly these situations. Specifically,
Double.NaN == Double.NaN // false
Double.valueOf(Double.NaN).equals(Double.NaN) // true!
Also, Double.compareTo() is specified so that it considers NaN equal to itself (it is consistent with equals) and NaN is considered larger than every other double value including POSITIVE_INFINITY.
There is also a utility method Double.compare(double, double) that compares two primitive double values using these same semantics.
These special cases let Java sorting, maps, and so forth work perfectly well with Double values, even though this violates IEEE 754. (But note that the comparison operators on primitive double values do conform to IEEE 754.)
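Here is a small sketch of those special cases in action. The class and method names (NaNOrderingSketch, sortedCopy, lookUpNaNKey) are made up for this example; the library calls are the standard Arrays.sort(double[]), HashMap, and Double.valueOf.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

final class NaNOrderingSketch {
    // Sorts a copy of the input. Arrays.sort(double[]) is documented to use the
    // total order of Double.compareTo, so NaN ends up after every other value.
    static double[] sortedCopy(double... values) {
        double[] copy = values.clone();
        Arrays.sort(copy);
        return copy;
    }

    // Stores a value under a NaN key, then reads it back through a separately
    // boxed NaN. This works because Double.equals treats NaN as equal to itself.
    static String lookUpNaNKey() {
        Map<Double, String> map = new HashMap<>();
        map.put(Double.NaN, "not a number");
        return map.get(Double.valueOf(Double.NaN));
    }
}
```

Calling sortedCopy(Double.NaN, 1.0, Double.POSITIVE_INFINITY, -2.0) should give -2.0, 1.0, Infinity, NaN in that order, and lookUpNaNKey() should find the stored entry even though NaN == NaN is false for the primitives.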
How should this apply to your Tag class and its UNKNOWN value? I don't think you need to follow SQL's rules for null here. If you're using Tag instances in Java data structures and with Java class libraries, you'd better make the class conform to the requirements of the compareTo() and equals() methods. I'd suggest making UNKNOWN equal to itself, making compareTo() consistent with equals, and defining some canonical sort order for UNKNOWN values. Usually this means sorting it higher than or lower than every other value. Doing this isn't terribly difficult, but it can be subtle. You need to pay attention to all the rules of compareTo().
The equals() method might look something like this, which is fairly conventional (and, as always when overriding equals(), hashCode() should be overridden to match):
public boolean equals(Object obj) {
    if (this == obj) {
        return true;
    }
    return obj instanceof Tag && id.equals(((Tag) obj).id);
}
Once you have this, then you'd write compareTo() in a way that relies on equals(). (That's how you get the consistency.) Then, special-case the unknown values on the left or right-hand sides, and finally delegate to comparison of the id field:
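public int compareTo(Tag o) {
    // Equal tags (including UNKNOWN vs. UNKNOWN) compare as 0.
    if (this.equals(o)) {
        return 0;
    }
    // UNKNOWN on the left sorts before every other tag.
    if (this.equals(UNKNOWN)) {
        return -1;
    }
    // UNKNOWN on the right: this tag sorts after it.
    if (o.equals(UNKNOWN)) {
        return 1;
    }
    // Neither side is UNKNOWN, so order by id.
    return id.compareTo(o.id);
}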
I'd recommend implementing equals(), so that you can do things like filter UNKNOWN values out of a stream, store them in collections, and so forth. Once you've done that, there's no reason not to make compareTo consistent with equals. I wouldn't throw any exceptions here, since that will just make standard libraries hard to use.
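Putting the pieces together, a self-contained sketch might look like the following. It assumes Tag wraps a String id and that UNKNOWN is a shared constant with a placeholder id; the "<unknown>" id, the hashCode() and toString() implementations, and the TagSortSketch helper are additions for illustration and not part of your code.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Assumed shape: a Tag wraps a String id; UNKNOWN is a shared sentinel value.
final class Tag implements Comparable<Tag> {
    static final Tag UNKNOWN = new Tag("<unknown>"); // hypothetical sentinel id

    private final String id;

    Tag(String id) {
        this.id = id;
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) {
            return true;
        }
        return obj instanceof Tag && id.equals(((Tag) obj).id);
    }

    // Kept in step with equals() so Tag works as a hash key.
    @Override
    public int hashCode() {
        return id.hashCode();
    }

    @Override
    public int compareTo(Tag o) {
        if (this.equals(o)) {
            return 0;
        }
        if (this.equals(UNKNOWN)) {
            return -1; // UNKNOWN sorts before every other tag
        }
        if (o.equals(UNKNOWN)) {
            return 1;
        }
        return id.compareTo(o.id);
    }

    @Override
    public String toString() {
        return this.equals(UNKNOWN) ? "UNKNOWN" : id;
    }
}

final class TagSortSketch {
    // Returns a sorted copy, leaving the input list untouched.
    static List<Tag> sorted(List<Tag> tags) {
        List<Tag> copy = new ArrayList<>(tags);
        Collections.sort(copy);
        return copy;
    }
}
```

With this in place, sorting a list such as b, UNKNOWN, a should put UNKNOWN first, followed by a and then b, and UNKNOWN can be stored in a HashSet or TreeSet like any other tag. Sorting UNKNOWN last instead only requires swapping the -1 and 1 in the two special cases.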