2

I have an object that represents an UNKNOWN value, or "Null object".

As in SQL, this object should never be equal to anything, including another UNKNOWN, so (UNKNOWN == UNKNOWN) -> false.

The object is used, however, in hashtables and the type is Comparable, so I created a class as follows:

public class Tag implements Comparable {    
  final static UNKNOWN = new Tag("UNKNOWN");
  String id;

  public Tag(String id) { 
    this.id = id; 
  }

  public int hashCode(){
    return id.hashCode();
  }

  public String toString(){
    return id;
  }

  public boolean equals(Object o){    
    if (this == UNKNOWN || o == UNKNOWN || o == null || !(o instanceof Tag))
      return false;

    return this.id.equals(((Tag)o).id);
  }

  public int compareTo(Tag o){
    if (this == UNKNOWN)
      return -1;
    if (o == UNKNOWN || o == null)
      return 1;

    return this.id.compareTo(o.id);
  }
}

But now compareTo() seems "inconsistent"?

Is there a better way to implement compareTo()?

GhostCat
  • 137,827
  • 25
  • 176
  • 248
isapir
  • 21,295
  • 13
  • 115
  • 116
  • 6
    If two objects can be identical in value and yet still not be equal, then they cannot be `Comparable`, as it would break the contract. – 4castle May 05 '17 at 03:13
  • 1
    It's not just inconsistent, it's wrong. If `a.compareTo(b) < 0`, it's required that `b.compareTo(a) > 0`. – shmosel May 05 '17 at 03:19
  • That's my worry exactly, so how do I implement a `compareTo()` with an UNKNOWN object that can not be equal other UNKNOWN objects? – isapir May 05 '17 at 03:20
  • https://meta.stackexchange.com/questions/66377/what-is-the-xy-problem – e4c5 May 05 '17 at 03:22
  • 1
    What would SQL do? – shmosel May 05 '17 at 03:24
  • logically you can not make something not equal to itself, that just does not make any sense –  May 05 '17 at 04:00
  • 1
    Have you ever heard of the SQL Specification? https://en.wikipedia.org/wiki/Null_(SQL) – isapir May 05 '17 at 04:03
  • 2
    @JarrodRoberson It's actually quite common. JavaScript has it also with `NaN`, where all attempts to sort an array containing `NaN` have undefined outputs, and all comparisons with `NaN` always return `false`. – 4castle May 05 '17 at 04:07
  • but you can not use `NaN` as a key in a HashTable because you could never retrieve the value associated with it because you could never match it, you kind of prove my point there in your comment –  May 05 '17 at 04:21
  • @JarrodRoberson I'm just saying that "not logical" doesn't mean "not allowed." It means that you have to create a well defined specification of what operations will have undefined output. – 4castle May 05 '17 at 04:22
  • Duplicate -> http://stackoverflow.com/questions/25932730/hashmap-with-null-key-and-null-value and http://stackoverflow.com/questions/11981852/why-hashtable-does-not-allows-null-keys-or-values –  May 05 '17 at 04:47
  • 1
    I don't see the big deal. Choose an order (null = 0, nulls first, nulls last etc.) and implement it. It won't be consistent with equals, but that's not a strict requirement. As long you're aware of the caveats (e.g. it'll work as key for `TreeMap`, but not `HashMap`), it's your decision. – shmosel May 05 '17 at 04:48
  • 1
    Close-voter: this is not an opinion-based question. It's a serious design question. See my answer for information on how SQL and Java deal with such things. http://stackoverflow.com/a/43796827/1441122 – Stuart Marks May 05 '17 at 05:01
  • The `Tag` class needs to implement `Comparable`. You should also probably make `id` be `final`. – Stuart Marks May 05 '17 at 05:23

4 Answers4

3

The documentation for compareTo mentions this situation:

It is strongly recommended, but not strictly required that

(x.compareTo(y)==0) == (x.equals(y))

Generally speaking, any class that implements the Comparable interface and violates this condition should clearly indicate this fact. The recommended language is "Note: this class has a natural ordering that is inconsistent with equals."

Therefore, if you want your object to be Comparable and yet still not allow two UNKNOWN objects to be equal via the equals method, you must make your compareTo "Inconsistent with equals."

An appropriate implementation would be:

public int compareTo(Tag t) {
    return this.id.compareTo(t.id);
}

Otherwise, you could make it explicit that UNKNOWN values in particular are not Comparable:

public static boolean isUnknown(Tag t) {
    return t == UNKNOWN || (t != null && "UNKNOWN".equals(t.id));
}

public int compareTo(Tag t) {
    if (isUnknown(this) || isUnknown(t)) {
        throw new IllegalStateException("UNKNOWN is not Comparable");
    }
    return this.id.compareTo(t.id);
}
Community
  • 1
  • 1
4castle
  • 32,613
  • 11
  • 69
  • 106
  • I think that you're on to something here. I don't want the IllegalStateException -- if I had wanted to risk a Runtime Exception then I would keep the undefined value as a regular Java `null` and get a NullPointerException. But the quote that you cited from the compareTo() documentation might be what I was looking for. I'm upvoting your answer for now, and will accept it if deemed the best solution. – isapir May 05 '17 at 04:42
  • I agree that this is *one* way to go: but you have to understand that you are about to open a can of worms. I would rather step back quite a bit; and rethink your whole approach. Will update my answer on that part in a second. – GhostCat May 05 '17 at 06:38
3

You're correct that your compareTo() method is now inconsistent. It violates several of the requirements for this method. The compareTo() method must provide a total order over the values in the domain. In particular, as mentioned in the comments, a.compareTo(b) < 0 must imply that b.compareTo(a) > 0. Also, a.compareTo(a) == 0 must be true for every value.

If your compareTo() method doesn't fulfil these requirements, then various pieces of the API will break. For example, if you sort a list containing an UNKNOWN value, then you might get the dreaded "Comparison method violates its general contract!" exception.

How does this square with the SQL requirement that null values aren't equal to each other?

For SQL, the answer is that it bends its own rules somewhat. There is a section in the Wikipedia article you cited that covers the behavior of things like grouping and sorting in the presence of null. While null values aren't considered equal to each other, they are also considered "not distinct" from each other, which allows GROUP BY to group them together. (I detect some specification weasel wording here.) For sorting, SQL requires ORDER BY clauses to have additional NULLS FIRST or NULLS LAST in order for sorting with nulls to proceed.

So how does Java deal with IEEE 754 NaN which has similar properties? The result of any comparison operator applied to NaN is false. In particular, NaN == NaN is false. This would seem to make it impossible to sort floating point values, or to use them as keys in maps. It turns out that Java has its own set of special cases. If you look at the specifications for Double.compareTo() and Double.equals(), they have special cases that cover exactly these situations. Specifically,

Double.NaN == Double.NaN // false

Double.valueOf(Double.NaN).equals(Double.NaN) // true!

Also, Double.compareTo() is specified so that it considers NaN equal to itself (it is consistent with equals) and NaN is considered larger than every other double value including POSITIVE_INFINITY.

There is also a utility method Double.compare(double, double) that compares two primitive double values using these same semantics.

These special cases let Java sorting, maps, and so forth work perfectly well with Double values, even though this violates IEEE 754. (But note that primitive double values do conform to IEEE 754.)

How should this apply to your Tag class and its UNKNOWN value? I don't think you need to follow SQL's rules for null here. If you're using Tag instances in Java data structures and with Java class libraries, you'd better make it conform to the requirements of the compareTo() and equals() methods. I'd suggest making UNKNOWN equal to itself, to have compareTo() be consistent with equals, and to define some canonical sort order for UNKNOWN values. Usually this means sorting it higher than or lower than every other value. Doing this isn't terribly difficult, but it can be subtle. You need to pay attention to all the rules of compareTo().

The equals() method might look something like this. Fairly conventional:

public boolean equals(Object obj) {
    if (this == obj) {
        return true;
    }

    return obj instanceof Tag && id.equals(((Tag)obj).id);
}

Once you have this, then you'd write compareTo() in a way that relies on equals(). (That's how you get the consistency.) Then, special-case the unknown values on the left or right-hand sides, and finally delegate to comparison of the id field:

public int compareTo(Tag o) {
    if (this.equals(o)) {
        return 0;
    }

    if (this.equals(UNKNOWN)) {
        return -1;
    }

    if (o.equals(UNKNOWN)) {
        return 1;
    }

    return id.compareTo(o.id);
}

I'd recommend implementing equals(), so that you can do things like filter UNKNOWN values of a stream, store it in collections, and so forth. Once you've done that, there's no reason not to make compareTo consistent with equals. I wouldn't throw any exceptions here, since that will just make standard libraries hard to use.

Stuart Marks
  • 127,867
  • 37
  • 205
  • 259
2

The simple answer is: you shouldn't.

You have contradiction requirements here. Either your tag objects have an implicit order (that is what Comparable expresses) OR you can have such "special" values that are not equal to anything, not even themselves.

As the other excellent answer and the comments point out: yes, you can somehow get there; for example by simply allowing for a.compare(b) < 0 and b.compare(a) < 0 at the same time; or by throwing an exception.

But I would simply be really careful about this. You are breaking a well established contract. And the fact that some javadoc says: "breaking the contract is OK" is not the point - breaking that contract means that all the people working on this project have to understand this detail.

Coming from there: you could go forward and simply throw an exception within compareTo() if a or b are UNKNOWN; by doing so you make at least clear that one shouldn't try to sort() a List<Tag> for example. But hey, wait - how would you find out that UNKNOWN is present in your list? Because, you know, UNKNOWN.equals(UNKNOWN) returns false; and contains() is using equals.

In essence: while technically possible, this approach causes breakages wherever you go. Meaning: the fact that SQL supports this concept doesn't mean that you should force something similar into your java code. As said: this idea is very much "off standards"; and is prone to surprise anybody looking at it. Aka "unexpected behavior" aka bugs.

GhostCat
  • 137,827
  • 25
  • 176
  • 248
  • The documentation gets away with this issue by stating that `compareTo(null)` throws a `NullPointerException`. I would find it reasonable that an `UNKNOWN` value would be a similar case where it's necessary to throw an exception. – 4castle May 05 '17 at 04:19
  • SQL has it both ways. It's a little unintuitive, but it works. – shmosel May 05 '17 at 04:42
  • Both comments make sense; I tried to factor them into my answer. Thank you @4castle and Mr. shmosel! – GhostCat May 05 '17 at 06:36
  • I was not suggesting that it would acceptable to allow *for a.compare(b) < 0 and b.compare(a) < 0 at the same time*. In fact, I already [commented](http://stackoverflow.com/questions/43795877/java-null-object-that-is-used-in-hash-tables-and-comparables/43796013#comment74632339_43795877) to the contrary. I meant that having a value not equal to itself does not, logically or contractually, preclude implicit ordering. – shmosel May 08 '17 at 03:24
0

A couple seconds of critical thinking:

There is already a null in Java and you can not use it as a key for a reason.

If you try and use a key that is not equal to anything else including itself you can NEVER retrieve the value associated with that key!

Community
  • 1
  • 1
  • 1
    I don't want ever to retrieve the value associated with "that key". The value is `null` or `undefined`. – isapir May 05 '17 at 04:14
  • then why do you want to put it in a `Hashtable` to begin with then, it makes no sense and not a very well thought out solution. This is an X/Y Problem at best. –  May 05 '17 at 04:36
  • Because all other values, i.e. all the values that are KNOWN, should be retrieved fast. The UNKNOWN value is an edge case. – isapir May 05 '17 at 04:40
  • you miss my entire point, if the value associated with it is `null` or `undefined` and will never be access why even put in the container? There is a reason that `null` is not allowed to be a key, that reason applies to your X/Y Problem. –  May 05 '17 at 04:45