1

I try to write a program that stores non-equal pair-objects, consisting of 2 Strings. For that matter, the pair (john, bob) is considered equal to (bob, john). My equals and compareTo implementations should work fine. To check whats wrong I let my program output the comparisions made for each new Pair I try to add. Looks like this:

@Override
public boolean equals(Object o){
    if (o==null){
        return false;
    }
    final Pair other = (Pair) o;
    return (this.compareTo(other)==0);
}



@Override
public int compareTo (Pair o){
  if (this.first.equals(o.first)){
      if (this.second.equals(o.second)){
          System.out.println("equal: "+this.first+" "+this.second+" and  " + o.first+" "+o.second);
          return 0;
      }
  }
  else if (this.first.equals(o.second)){
        if (this.second.equals(o.first)){
            System.out.println("equal: "+this.first+" "+this.second+" and  " + o.first+" "+o.second);
            return 0;
        }
  }
    System.out.println(" not equal " +this.first+" "+this.second+" and  " + o.first+" "+o.second);
  return -1;

Exampleinput:

 bob john
 john john
 john john
 john bob
 bob will
 john hohn

If I let this run it will print out the size of the TreeSat after each trial to add a new element. It will also print what is written in the compareTo method. I added comments to specifify my problem.

   equal: bob john and  bob john    //Why comparing the first element at  
                                      all?
1
 not equal john john and  bob john
2
 not equal john john and  bob john
equal: john john and  john john
2
equal: john bob and  bob john
2
 not equal bob will and  bob john
 not equal bob will and  john john
3

 not equal john hohn and  john john    //no comparision of (john hohn) and                                       
 not equal john hohn and  bob will     //(bob john) why?
4
Louis
  • 89
  • 11
  • 1
    That's the advantage of tree based structures - you actually don't have to visit every element to perform some operation. Thanks to that, you can get to sublinear (to be specific, logarithmic) complexity. – Ondra K. Apr 05 '19 at 09:54
  • 2
    You are never returning `1` in your `compareTo`, which means that the object passed to it is equal or smaller, but never bigger – Lino Apr 05 '19 at 09:54
  • @OndraK. what? elements are always supposed to be compared when inserted – Eugene Apr 05 '19 at 09:55
  • 1
    @Eugene if you're inserting to BST, you definitely do not need to compare all elements of the collection to find the correct place for it, do you? – Ondra K. Apr 05 '19 at 09:57
  • @OndraK. *to find* != *to insert* is it? – Eugene Apr 05 '19 at 09:58
  • @Eugene It indeed is almost equal in case of BST - you first try to _search_ for the element, and then, after you hit a leaf node, you add it as a child on the place you've found. – Ondra K. Apr 05 '19 at 09:59
  • @OndraK. I think we are talking about the same thing here, actually. – Eugene Apr 05 '19 at 10:01
  • @Lino does that matter since I dont want to sort the elements? I just want to decide whether they already exist in my set and if so not add them – Louis Apr 05 '19 at 10:33
  • @OndraK I can comprehend why I wouldnt need to compare a new element to each element if I would hava an order. But since its only about: This element already exist or it doesnt I feel like every single comparision should be made. Also: With a longer input I have elements added to the list that are duplicates of already existing elements. – Louis Apr 05 '19 at 10:41
  • 1
    TreeSet is sorted, you cannot turn that off. You supplied it a compareTo that doesn't work properly for sorting, so to quote the API doc "it just fails to obey the general contract of the Set interface". –  Apr 05 '19 at 10:55
  • I see, thanks. That would mean for my program, it would be a much smarter choice to use HashSet because it will not sort and hence compare a new element to all elements already in the set. If that is correct, what still confuses me: Why is HashSet faster than TreeSet even tho it has to do more comparisions before adding a new element? (I got the information that it is faster from here: https://stackoverflow.com/questions/1463284/hashset-vs-treeset) – Louis Apr 05 '19 at 12:24
  • 1
    HashSet will not typically compare everything, if you have a good hashCode (no collisions), it is supposed to do only one comparison. A collision is two or more non-equal elements sharing the same hashCode value, those are then told apart by equals. –  Apr 05 '19 at 12:33
  • 1
    If you want a HashSet that compares everything all the time, just make a hashCode that returns 0. Everything will be collisions and will have to be resolved by equals... –  Apr 05 '19 at 12:35
  • Alright, that means checking if 2 elements have a different hashcode is way faster than checking whether they are equal or not? Im asking because I thought at least hashcode has to compare the hashcodes of the elements – Louis Apr 05 '19 at 12:37
  • 1
    Hashcode tells you the address where to store/look for the element directly, you just have to double-check that you got the right one. And a good hashing function will be as random as possible and changing one bit of the key will give you a wildly different address. –  Apr 05 '19 at 12:45
  • 1
    https://en.wikipedia.org/wiki/Hash_table <- explanation of hash tables –  Apr 05 '19 at 12:49
  • 1
    So making a hashing function that says "just dump everything into bucket 0" is a fairly horrible idea and you will lose the benefit of hashing, but it is still guaranteed to work correctly, if slowly... –  Apr 05 '19 at 12:51

1 Answers1

4

ONE: To answer your question: TreeSet does not need to compare all elements, because there is a defined order for the elements. Consider a dictionary: open it in the middle and you will instantly know, if the word you need is before or after that page. You won't need to check both halfs of the dictionary.

TWO: Your compareTo() method is faulty. Consider two objects:

Pair a = Pair.of(1, 2);
Pair b = Pair.of(3, 4);

Your compareTo() will return -1 in both cases, wich it must not:

a.compareTo(b) == -1
b.compareTo(a) == -1

Mathematically spoken your relation "compareTo" does not define an order and thus violates the API contract:

The implementor must ensure sgn(x.compareTo(y)) == -sgn(y.compareTo(x)) for all x and y.

Amadán
  • 718
  • 5
  • 18