37

Can we find the hashcode of a list that contains itself as element?

I know this is a bad practice, but this is what the interviewer asked.

When I ran the following code it throws a StackOverflowError:

public class Main {
    public static void main(String args[]) {
        ArrayList<ArrayList> a = new ArrayList();
        a.add(a);
        a.hashCode();
    }
}

Now here I have two questions:

  1. Why is there a StackOverflowError?
  2. Is it possible to find the hash code in this way?
Holger
  • 285,553
  • 42
  • 434
  • 765
T-Bag
  • 10,916
  • 3
  • 54
  • 118
  • 7
    Because you add the List to itself. try a.hashCode() without the add statement – Jens Jan 07 '20 at 06:46
  • When you put an object into an arraylist, you are storing the reference to the object.In your case you are putting an ArrayList witch is itself reference. – Vishwa Ratna Jan 07 '20 at 06:52
  • Similar : https://stackoverflow.com/questions/42566559/arraylist-of-arraylists-adding-and-retrieving-elements – Vishwa Ratna Jan 07 '20 at 06:53
  • Ok, I got why there is stackoverflow , can some one help me explain the problem number 2- How to find this – T-Bag Jan 07 '20 at 07:10
  • 9
    As others have answered, this isn't possible, by the very definition of the `List` interface, the `hashCode` of a list depends on its members. Given that the list is its own member, it's hash code depends on its `hashCode`, which depends on its `hashCode`... and so on, causing infinite recursion and the `StackOverflowError` you're running into. Now the question is: *why* do you require a list to contain itself? I can guarantee you that you can achieve whatever it is that you're trying to do, in a better way, without requiring recursive membership like this. – Alexander Jan 07 '20 at 18:56
  • [tag:hashcode] ? – David Cary Jan 12 '20 at 22:36

6 Answers6

36

The hash code for conforming List implementations has been specified in the interface:

Returns the hash code value for this list. The hash code of a list is defined to be the result of the following calculation:

 int hashCode = 1;
 for (E e : list)
     hashCode = 31*hashCode + (e==null ? 0 : e.hashCode());

This ensures that list1.equals(list2) implies that list1.hashCode()==list2.hashCode() for any two lists, list1 and list2, as required by the general contract of Object.hashCode().

This doesn’t require that the implementation looks exactly like that (see How to compute the hash code for a stream in the same way as List.hashCode() for an alternative), but the correct hash code for a list only containing itself would be a number for which x == 31 + x must be true, in other words, it is impossible to calculate a conforming number.

Holger
  • 285,553
  • 42
  • 434
  • 765
  • 1
    @Holger, Eirc wants to replace the code of the entire function `hashCode()` to return `0`. This technically solves `x == 31 + x` but ignores the requirement that x must be at least 1. – bxk21 Jan 07 '20 at 20:56
  • 4
    @EricDuminil the point of my answer is, the contract describes a logic which `ArrayList` implements literally, leading to a recursion, but there's no conforming alternative implementation either. Note that I posted my answer at a time when the OP already understood why this particular implementation leads to a `StackOverflowError`, which has been addressed in other answers. So I focused on the general impossibility of a conforming implementation completing in finite time with a value. – Holger Jan 07 '20 at 21:31
  • 1
    I'd say it's not true that you can't conforming specification of Object: the hashCode has to be equals if the list are equals, but the reverse is obviously not mandatory, you can have collision in hashcode compution, so in this case I would just stop the recursion by ignoring the self-containing list. It can be complex if the cycle is on more than one object. Of course we cannot conform to the "specification" of the list that impose a non computable result, but it is strange to me to have a specification with an implementation, the specification for "object" should be sufficient. – pdem Jan 08 '20 at 08:59
  • @pdem the `List` *interface* is not an implementation but a contract. And there’s nothing strange for having a specification of `hashCode()` there, as we also have a specification for `equals` there, which is necessary for the interoperability between all `List` implementations. In other words, when you have a list containing itself, then `list.get(0).equals(list)` will be `true`, but also `List.of(list).equals(list)` will be `true`. Therefore, `list.hashCode()`, `list.get(0).hashCode()`, and `List.of(list).hashCode()` would all have to evaluate to the same value. – Holger Jan 08 '20 at 09:21
  • 1
    I'am sorry I dont want to make a discussion in comment: my point is, that I criticate the Sun specification content (yes a bit pretentious). We need a specification for list, but an interface is a contract and the specification shouldn't contains an implementation, and you prefectly express that in this way it is impossible to implement a List that contains itself with a correct hashcode. it's always an error in IT methodology to drop the analysis and only give the implementation, like if an architect directly make the concrete of the building without any plan. – pdem Jan 08 '20 at 09:29
  • 2
    @pdem it doesn’t matter whether the specification uses a wordy description of an algorithm, a formula, pseudo code, or actual code. As said in the answer, the code given in the spec does not preclude alternative implementations in general. The form of the spec doesn’t say anything about whether analysis happened or not. The interface documentation’s sentence“*While it is permissible for lists to contain themselves as elements, extreme caution is advised: the equals and hashCode methods are no longer well defined on such a list*” suggests that such an analysis did happen. – Holger Jan 08 '20 at 11:16
  • Well it does matter, we won't agree on this point. The irony is that you prefectly demonstrate how you can block future implementation of the list be beeing too specific on the code: it serve nothing to impose that hashcode=31+something, But the important point of the specification is that the hashcode is computed from its elements, I would just ignore the specification if I had to implement the list, or maybe create an implementation of Collection that doesn't have this constraint. – pdem Jan 08 '20 at 14:06
  • 1
    @pdem you are still ignoring the symmetry requirement of `equals`, which implies that all list implementations must agree on whether they are equal. As already explained, lists are defined to be equal when they contain equal elements which implies that your custom list still must be equal to `List.of(list)` hence, it also must have the same hash code. Since you have no control over the list implementation returned by `List.of(…)`, or any other list implementation other than yours, the only way to fulfill the general contract of `hashCode()`, is to have a very specific definition of it. – Holger Jan 08 '20 at 14:32
  • I'm not ignoring, I said it at first comment " the hashCode has to be equals if the list are equals" but the list don't need to be equals if the hashcode are equals, beacause again as I said `you can have collision in hashcode computation`. If their werent the specification, we could perfectly compute a non infinite loop recursive hashcode that respect the equals, the "always 0" was an example, a more optimal one would be to return 0 inside the loop each time a cycle is detected. – pdem Jan 08 '20 at 15:11
  • 2
    @pdem you are reversing it. I never said the that the list have to be equal because of the hash code. The lists **are** equal, by definition. `Arrays.asList("foo")` is equal to `Collections.singletonList("foo")`, is equal to `List.of("foo")` is equal to `new ArrayList<>(List.of("foo"))`. All these lists are equal, point. Then, since these lists are equal, they must have the same hash code. Not the other way round. Since they must have the same hash code, it must be well defined. Regardless of how they defined it (back in Java 2), it must be well defined, to be agreed by all implementations. – Holger Jan 08 '20 at 15:27
  • OK I see it, to make it simple, `ArrayList` must be equals to any other types including `Collections/singletonList`, was missing that. I was implying the equal/hashcode duality of the same type implementation. Then again, if needed, I would go with a custom implementation of Collection, since it's impossible by specification to use a List. – pdem Jan 08 '20 at 17:01
  • 2
    @pdem exactly, a custom implementation which either, doesn’t implement `List` or has a big warning sign “do not mix with ordinary lists”, compare with `IdentityHashMap` and its “**This class is *not* a general-purpose Map implementation!**” warning. In the former case, you are already fine with implementing `Collection` but adding the list style index based access methods as well. Then, no equality constraint exists, but it still works smoothly with other collection types. – Holger Jan 08 '20 at 17:28
23

Check out the skeletal implementation of the hashCode method in AbstractList class.

public int hashCode() {
    int hashCode = 1;
    for (E e : this)
        hashCode = 31*hashCode + (e==null ? 0 : e.hashCode());
    return hashCode;
}

For each element in the list, this calls hashCode. In your case list has itself as it's sole element. Now this call never ends. The method calls itself recursively and the recursion keeps winding until it encounters the StackOverflowError. So you can not find the hashCode this way.

Ravindra Ranwala
  • 20,744
  • 6
  • 45
  • 63
14

You have defined a (pathological) list that contains itself.

Why there is StackOverflowError ?

According to the javadocs (i.e. the specification), the hashcode of a List is defined to a function of the hashcode of each of its elements. It says:

"The hash code of a list is defined to be the result of the following calculation:"

int hashCode = 1;
    for (E e : list)
         hashCode = 31*hashCode + (e==null ? 0 : e.hashCode());

So to compute the hashcode of a, you first compute the hashcode of a. That is infinitely recursive and leads quickly to a stack overflow.

Is it possible to find hash code in this way ?

No. If you consider the algorithmic specification above in mathematical terms, the hashcode of a List that contains itself is a non-computable function. It is not possible to compute it this way (using the above algorithm) or any other way.

Stephen C
  • 698,415
  • 94
  • 811
  • 1,216
  • I don't know why this answer is lower than the two other since it actually respond to the OP questions with explanations. – Neyt Jan 08 '20 at 09:44
9

No, the documentation has an answer

The documentation of the List structure explicitly states:

Note: While it is permissible for lists to contain themselves as elements, extreme caution is advised: the equals and hashCode methods are no longer well defined on such a list.

There's not much more to say beyond that - according to the Java specification, you won't be able to calculate hashCode for a list that contains itself; other answers go into detail why it's so, but the point is that it is known and intentional.

Community
  • 1
  • 1
Peteris
  • 3,281
  • 2
  • 25
  • 40
  • 1
    You did tell why it's outside the specification, so it explains that it's not a bug. That was the part missing from the other answers. – pdem Jan 08 '20 at 08:48
3

Ravindra's answer gives a good explanation for point 1. To comment on question 2:

Is it possible to find hash code in this way?

Something is circular here. At least one of these 2 must be wrong in the context of this stack overflow error:

  • that the hash code of the list must take those of its elements into account
  • that it's OK for a list to be its own element

Now, because we're dealing with an ArrayList, the first point is fixed. In other words, maybe you need a different implementation to be able to meaningfully compute a hash code of a recursive list... One could extend ArrayList and skip adding the hash codes of elements, something like

for (E e : this)
  if(e == this) continue; //contrived rules
  hashCode = 31*hashCode + (e==null ? 0 : e.hashCode());

Using such a class instead of ArrayList, you could.

With ArrayList, the second point is wrong. So if the interviewer meant "Is it possible to find hash code in this way (with an array list)?", then the answer is no, because that's absurd.

ernest_k
  • 44,416
  • 5
  • 53
  • 99
  • 1
    The hash code calculation is mandated by [the `List` contract](https://docs.oracle.com/javase/8/docs/api/java/util/List.html#hashCode--). No valid implementation can skip itself. From the specification, you can derive that if you find an `int` number for which `x == 31 + x` is `true`, then you can implement a valid short-cut… – Holger Jan 07 '20 at 07:36
  • I didn't quite understand what @Holger was saying. But there is 2 Problems with solution: First: This solves the Problem only when this list is an item of itself and not if the list where an item of an item of itself (deeper layers of recursion) Second: If the List skipped itself it might be equal to an empty list. – Jonas Michel Jan 07 '20 at 16:09
  • 1
    @JonasMichel I didn't quite propose a solution. I'm just philosophically debating around the absurdity of question 2. If we *must* compute a hash code for such a list, then it won't work unless we remove the constraint of an array list (and Holger reinforces that by saying that `List` formally dictates the hash code logic to be complied with by implementations) – ernest_k Jan 07 '20 at 16:21
  • My (limited) understanding is that `List` provides a hash code calculation, but that we could override it. The only real requirement is that of general hashcodes: if the objects are equal, then the hashcodes must be equal. This implementation follows that requirement. – Teepeemm Jan 07 '20 at 19:32
  • 1
    @Teepeemm `List` is an interface. It defines a *contract*, it does *not* provide an implementation. Of course, the contract covers both, the equality and the hash code. Since the general contract of equality is that it has to be symmetric, if you change one list implementation, you would have to change all existing list implementations, otherwise, you would even break the fundamental contract of `java.lang.Object`. – Holger Jan 07 '20 at 20:32
0

Because when you call same function with from same function then it will create condition of recursion which never ends. And to prevent this operation JAVA will return java.lang.StackOverflowError

Below is example code which explains similar scenario:

public class RefTest {

    public static void main(String... strings) {
        RefTest rf = new RefTest();
        rf.message(message("test"));
    }

    public static String message2(String s){
        return message(s);
    }

    public static String message(String s){
        return message2(s);
    }

}   
Simmant
  • 1,477
  • 25
  • 39