8

Based on various bad experiences my rule of thumb as a Java programmer is to only implement equals() and hashCode() on immutable objects, where two instances of the object really are interchangeable.

Basically I want to avoid situations like the HashMap key problem in that link, or like the following:

  1. Get a thing with a certain identity.
  2. Modify it.
  3. Add it to a set.
  4. (later) Get another thing with the same identity.
  5. Modify it.
  6. Add it to the same set.
  7. Fail to notice that this add doesn't actually happen, since the set thinks the thing is already there.
  8. Do something with the things in the set.
  9. Fail to notice that the change from step (5) is ignored, and we still have the state from step (2).
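
To make the nine steps concrete, here is a minimal sketch in plain Java. The `Thing` class and its ID-only `equals()`/`hashCode()` are hypothetical, invented for illustration; the point is only that `Set.add()` returns `false` and leaves the existing element in place.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical mutable class whose equality is based on the ID alone.
class Thing {
    final long id;
    String state;

    Thing(long id, String state) {
        this.id = id;
        this.state = state;
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof Thing && ((Thing) o).id == id;
    }

    @Override
    public int hashCode() {
        return Long.hashCode(id);
    }
}

class SilentAddDemo {
    // Walks through steps 1-9 and returns the state that survives in the set.
    static String run() {
        Set<Thing> things = new HashSet<>();

        Thing first = new Thing(42L, "original");    // step 1
        first.state = "modified in step 2";           // step 2
        things.add(first);                            // step 3: add() returns true

        Thing second = new Thing(42L, "original");   // step 4
        second.state = "modified in step 5";          // step 5
        things.add(second);                           // step 6: add() returns false

        // Step 7: nobody looked at the return value of the second add().
        // Steps 8-9: the set still holds the first instance.
        return things.iterator().next().state;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

The only signal that anything went wrong is the boolean returned by the second `add()`, which is easy to ignore.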

And by and large over the course of my Java career I haven't found a lot of use for equals() except for (1) value objects and (2) putting things into collections. I've also found that immutability + copy-and-modify constructors/builders is generally a much happier world than setters. Two objects might have the same ID and might represent the same logical entity, but if they have different data -- if they represent snapshots of the conceptual entity at different times -- then they're not equal().
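
As a rough sketch of what I mean by immutability plus copy-and-modify (the `CustomerSnapshot` class and its fields are made up for this example): two snapshots with the same ID but different data are deliberately not `equals()`.

```java
import java.util.Objects;

// Hypothetical immutable snapshot of an entity at one point in time.
final class CustomerSnapshot {
    private final long id;
    private final String email;

    CustomerSnapshot(long id, String email) {
        this.id = id;
        this.email = Objects.requireNonNull(email);
    }

    long id() { return id; }
    String email() { return email; }

    // Copy-and-modify: returns a new snapshot instead of mutating this one.
    CustomerSnapshot withEmail(String newEmail) {
        return new CustomerSnapshot(id, newEmail);
    }

    // Value semantics: every field takes part in equality.
    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof CustomerSnapshot)) return false;
        CustomerSnapshot other = (CustomerSnapshot) o;
        return id == other.id && email.equals(other.email);
    }

    @Override
    public int hashCode() {
        return Objects.hash(id, email);
    }
}
```

With this design, `before.withEmail("new@example.com")` leaves `before` untouched, and the two snapshots can sit side by side in the same set.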

Anyway, I'm now in a Hibernate shop, and my more Hibernate-savvy colleagues are telling me this approach isn't going to work. Specifically, the claim seems to be that in the following scenario --

  1. Hibernate loads a thing from the database -- we'll call it instance h1.
  2. This thing is marshaled and sent somewhere via a web service.
  3. The web service client fiddles with it and sends a modified version back.
  4. The modified version is unmarshalled on the server -- we'll call it instance h4.
  5. We want Hibernate to update the database with the modifications.

-- unless h1.equals(h4) (or perhaps h4.equals(h1), I'm not clear, but I would hope it's symmetric anyway so whatever), Hibernate will not be able to tell that these are the same thing, and Bad Things Will Happen.

So, what I want to know:

  • Is this true?
  • If so, why? What is Hibernate using equals() for?
  • If Hibernate needs h1 and h4 to be equal, how does it (and how do we) keep track of which one is the modified version?

Note: I've read Implementing equals() and hashCode() in the Hibernate docs and it doesn't deal with the situation I'm worried about, at least directly, nor does it explain in any detail what Hibernate really needs out of equals() and hashCode(). Neither does the answer to equals and hashcode in Hibernate, or I wouldn't have bothered to post this.

David Moles
  • "a much happier world than setters": agreed. I never implement a setter until I actually need it, and strive for immutable classes. – Raedwald Feb 01 '12 at 00:52
  • possible duplicate of [equals and hashcode in Hibernate](http://stackoverflow.com/questions/1638723/equals-and-hashcode-in-hibernate) – Don Roby Feb 01 '12 at 00:53
  • If two objects have the same identity and different data (as in your paragraph after the list of 9), you probably have a bug. If they represent the same object at different points in time, then they are indeed `equal()`, just like I would be equal to the same me from 10 years ago, even if I do have different attributes. I would expect the old me to replace the new me if he was added to a set after I was... – corsiKa Feb 01 '12 at 00:54
  • @glowcoder You would expect the old you to replace the new you if he was added to a set after you were, but if you're `equal()` to him, he won't. This isn't philosophy, it's Java. – David Moles Feb 01 '12 at 01:02
  • @DonRoby No, it isn't. That question's answer lists the official best practices but explains nothing about what's going on under the hood or why those practices are necessary. (I also read the documentation linked in the answer, as you'll see at the top of my question.) – David Moles Feb 01 '12 at 01:04
  • http://en.wikipedia.org/wiki/Identity_and_change : In general whether *X* is equivalent to *Y* has an element of semantic choice. But if you are using JPA/Hibernate you probably ought to be consistent with whatever semantics they impose. Whether you like it or not. I guess if you don't you will get corner cases just as tricky as the `HashSet`/`HashMap` problems. – Raedwald Feb 01 '12 at 01:04
  • @Raedwald That's what I'm asking: What semantics does JPA/Hibernate impose? (My coworkers can't actually explain; they're just passing on received wisdom.) – David Moles Feb 01 '12 at 01:05
  • Easy: Java semantics. The `equals` and `hashCode` contract. What constitutes equal and unequal entities is up to the model and absolutely depends on the business requirements, including semantics for historical data; that's why there is no "automatic" equals method (like field-by-field comparison). But you absolutely have to do it correctly. For example, you break automatic caching or collections if it is not done or is done wrong. – Hauke Ingmar Schmidt Feb 01 '12 at 01:25
  • The contract just says that `equals()` has to be reflexive, symmetric, transitive and consistent, and that `equal` objects should have the same `hashCode`. This is provided by `Object`'s default implementation. There's nothing that says two objects which are business-equivalent have to be `equal`. What I'm asking is, is there anything in Hibernate that *requires* overriding `equals()`, or is it just, "if you override `equals()`, don't mess it up?" – David Moles Feb 01 '12 at 17:42
  • You cite the implementation details and formal necessities, not what the method is for, the semantics. One sentence before the implementation needs: "Indicates whether some other object is "equal to" this one.". Sure, the concepts of "identity" and "equality" are not explained there. With the default implementation of hashcode hashed collections are inefficient. With the default implementation of equals you can't retrieve objects other than by iterating or holding a second collection that holds keys. With the Apache Lang tools it is very simple to build good equals and hashcode. – Hauke Ingmar Schmidt Feb 01 '12 at 23:12
  • @his, I said, specifically, what semantics *does JPA/Hibernate impose*. This is not a philosophical question. It is not a question about general good practices when implementing `equals()` and `hashCode()`, nor about what you can do *in general* when you do vs. when you don't. It is a question about JPA/Hibernate. – David Moles Feb 03 '12 at 00:01
  • For an example of a similar imposition, see [SortedSet](http://docs.oracle.com/javase/6/docs/api/java/util/SortedSet.html), which requires comparison semantics (comparison consistent with `equals()`) that are "[strongly recommended (though not required)](http://docs.oracle.com/javase/1.4.2/docs/api/java/lang/Comparable.html)" for Java comparisons in general, if SortedSet is to obey the general contract of the Set interface. – David Moles Feb 03 '12 at 00:05

2 Answers

5

First of all, your original idea, that you should implement equals() and hashCode() only on immutable objects, certainly works, but it's stricter than it needs to be. You just need these two methods to rely on immutable fields. Any field whose value may change is unsuitable for use in those two methods, but the other fields need not be immutable.
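
A small sketch of that idea, using a hypothetical `Account` class: `equals()` and `hashCode()` look only at the field that never changes, so mutating the other field does not disturb a `HashSet`.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical class: accountNumber never changes, the balance does.
class Account {
    private final String accountNumber; // immutable, used for equality
    private long balanceCents;          // mutable, deliberately ignored

    Account(String accountNumber, long balanceCents) {
        this.accountNumber = accountNumber;
        this.balanceCents = balanceCents;
    }

    void deposit(long cents) { balanceCents += cents; }
    long balanceCents() { return balanceCents; }

    @Override
    public boolean equals(Object o) {
        return o instanceof Account
                && ((Account) o).accountNumber.equals(accountNumber);
    }

    @Override
    public int hashCode() {
        return accountNumber.hashCode();
    }
}

class ImmutableKeyDemo {
    // Returns true: the hash code did not change, so the lookup still works.
    static boolean stillFoundAfterMutation() {
        Set<Account> accounts = new HashSet<>();
        Account account = new Account("ACC-1", 0);
        accounts.add(account);
        account.deposit(500);
        return accounts.contains(account);
    }
}
```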

Having said that, Hibernate knows they're the same object by comparing their primary keys. This leads many people to write those two methods to rely on the primary key. Hibernate docs recommend you don't do it this way, but many people ignore this advice without much trouble. It means you can't add entities to a Set until after they've been persisted, which is a restriction that's not too hard to live with.

Hibernate docs recommend using a business key. But the business key should rely on fields that uniquely identify an object. The Hibernate docs say "use a business key that is a combination of unique, typically immutable, attributes." I use fields that have a unique constraint on them in the database. So, if your SQL CREATE TABLE statement specifies a constraint as

CONSTRAINT uc_order_num_item UNIQUE (order_num, order_item)

then those two fields can be your business key. That way, if you change one of them, both Hibernate and Java will treat the modified object as a different object. Of course, if you do change one of these "immutable" fields, you mess up any Set they belong to. So I guess you need to document clearly which fields comprise the business key, and write your application with the understanding that fields in the business key should never be changed for persisted objects. I can see why people ignore the advice and just use the primary key. But you could define the primary key like this:

CONSTRAINT pk_order_num_item PRIMARY KEY (order_num, order_item)

And you would still have the same problem.
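
Here is a sketch of a business-key `equals()`/`hashCode()` for the `order_num`/`order_item` pair above, written as plain Java with no Hibernate annotations. The `OrderLine` class, its `quantity` field, and the setter on a key field are assumptions made for illustration; the setter exists only to demonstrate the hazard.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical entity; orderNum + orderItem form the business key.
class OrderLine {
    private final long orderNum;
    private int orderItem;  // part of the key, but (unwisely) settable
    private int quantity;   // ordinary mutable data, not part of the key

    OrderLine(long orderNum, int orderItem, int quantity) {
        this.orderNum = orderNum;
        this.orderItem = orderItem;
        this.quantity = quantity;
    }

    void setOrderItem(int orderItem) { this.orderItem = orderItem; }
    void setQuantity(int quantity) { this.quantity = quantity; }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof OrderLine)) return false;
        OrderLine other = (OrderLine) o;
        return orderNum == other.orderNum && orderItem == other.orderItem;
    }

    @Override
    public int hashCode() {
        return Objects.hash(orderNum, orderItem);
    }
}

class BusinessKeyDemo {
    // Changing a non-key field is harmless: the line is still found.
    static boolean foundAfterQuantityChange() {
        Set<OrderLine> lines = new HashSet<>();
        OrderLine line = new OrderLine(1001L, 1, 5);
        lines.add(line);
        line.setQuantity(6);
        return lines.contains(line);
    }

    // Changing a key field changes the hash code, so the set loses track of it.
    static boolean foundAfterKeyChange() {
        Set<OrderLine> lines = new HashSet<>();
        OrderLine line = new OrderLine(1001L, 1, 5);
        lines.add(line);
        line.setOrderItem(2);
        return lines.contains(line);
    }
}
```

The second method is the "mess up any Set they belong to" case: the object is still physically in the set, but a lookup no longer finds it.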

Personally, I would like to see an annotation that specifies every field in the business key, and have an IDE inspection that checks if I modify it for persisted objects. Maybe that's asking too much.

Another approach, one that solves all of these problems, is to use a UUID for the primary key, which you generate on the client when you first construct an unpersisted entity. Since you never need to show it to the user, your code is not likely to change its value once you set it. This lets you write hashCode() and equals() methods that always work, and remain consistent with each other.
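
A minimal sketch of the UUID approach, again in plain Java. The `Invoice` class is hypothetical, and a real Hibernate entity would also need the usual mapping and a no-argument constructor, which I've left out here.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.UUID;

// Hypothetical entity whose ID is assigned at construction, before persisting.
class Invoice {
    private final UUID id;
    private String title;

    // New, unpersisted entity: generate the ID on the client.
    Invoice(String title) {
        this(UUID.randomUUID(), title);
    }

    // Rebuilding an entity whose ID is already known (e.g. after unmarshalling).
    Invoice(UUID id, String title) {
        this.id = id;
        this.title = title;
    }

    UUID id() { return id; }
    String title() { return title; }
    void setTitle(String title) { this.title = title; }

    @Override
    public boolean equals(Object o) {
        return o instanceof Invoice && ((Invoice) o).id.equals(id);
    }

    @Override
    public int hashCode() {
        return id.hashCode();
    }
}

class UuidKeyDemo {
    // The entity can go into a set before it is saved and stays findable.
    static boolean foundAfterEdit() {
        Set<Invoice> invoices = new HashSet<>();
        Invoice invoice = new Invoice("draft");
        invoices.add(invoice);
        invoice.setTitle("final");
        return invoices.contains(invoice);
    }
}
```

Because the ID exists from the moment the object is constructed, the restriction about not adding unpersisted entities to a Set goes away.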

One more thing: If you want to avoid the problem of adding an object to a Set that already contains a different (modified) version of it, the only way is to always ask the set if it's already there before adding it. Then you can write code to handle that special case.
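
One way to handle that special case, sketched with a hypothetical helper `addOrReplace` and a made-up `Item` class. Replacing the stale element is just one possible policy; you might instead want to merge the two or throw.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical class whose equality is based on the ID alone.
class Item {
    final long id;
    final String state;

    Item(long id, String state) {
        this.id = id;
        this.state = state;
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof Item && ((Item) o).id == id;
    }

    @Override
    public int hashCode() {
        return Long.hashCode(id);
    }
}

class SetUpdates {
    // Adds item to the set; if an equal element is already there, it is
    // replaced. Returns true when a replacement happened.
    static <T> boolean addOrReplace(Set<T> set, T item) {
        if (set.contains(item)) {
            set.remove(item); // drop the stale, equal element
            set.add(item);    // store the newer version
            return true;
        }
        set.add(item);
        return false;
    }
}
```

Usage: `SetUpdates.addOrReplace(items, new Item(1L, "new"))` returns `true` if an `Item` with ID 1 was already in `items`, and afterwards the set holds the new instance.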

Orlando DFree
1

What semantics does JPA/Hibernate impose?

The JPA specification says the following.

2.4 Primary Keys and Entity Identity

Every entity must have a primary key. ... The value of its primary key uniquely identifies an entity instance within a persistence context and to EntityManager operations

I interpret that as saying the semantics of equivalence for JPA entities is equivalence of primary keys. That suggests the equals() method should compare the primary keys for equivalence, and nothing else.

But the Hibernate advice you reference (and another article I've seen) says not to do that, and to use a "business key" instead of the primary key. The reason seems to be that we cannot guarantee that an entity object has a value for a generated primary key until the entity has been synchronized (using EntityManager.flush()) to the database.
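
To illustrate why a generated key is awkward for `equals()`/`hashCode()`, here is a plain-Java simulation with no JPA involved. The `Customer` class is hypothetical, and `assignGeneratedId` merely stands in for whatever the persistence provider does when it assigns the key.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical entity with a database-generated primary key.
class Customer {
    private Long id; // null until the key has been generated
    private final String name;

    Customer(String name) {
        this.name = name;
    }

    // Stand-in for the provider assigning the key at flush time.
    void assignGeneratedId(long generatedId) {
        this.id = generatedId;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Customer)) return false;
        Customer other = (Customer) o;
        return id != null && id.equals(other.id);
    }

    @Override
    public int hashCode() {
        return Objects.hashCode(id); // 0 while id is null
    }
}

class GeneratedIdDemo {
    // Returns false: the entity was hashed with a null id, then the id changed.
    static boolean foundAfterIdAssigned() {
        Set<Customer> customers = new HashSet<>();
        Customer customer = new Customer("Ann");
        customers.add(customer);          // stored under hash code 0
        customer.assignGeneratedId(7L);   // hash code is now 7
        return customers.contains(customer);
    }
}
```

This is the same failure as mutating any field used in `hashCode()`; the generated key just makes it happen implicitly.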

Raedwald
  • The [spec](http://download.oracle.com/otndocs/jcp/ejb-3_0-fr-eval-oth-JSpec/) says "The primary key class must define `equals` and `hashCode` methods," but it doesn't say anything about the entity class, at least in that section -- it doesn't say that `EntityManager` or the "persistence context" depend on entity `equals()`. (In fact, I'd hope they wouldn't, and would depend on the `Id` annotations instead. But I don't know if that's the case, which is why I'm asking.) – David Moles Feb 01 '12 at 18:12
  • Agreed with @DavidMoles. The JPA spec is saying that Hibernate should use an entity's primary key to determine equality. In other words, Hibernate uses the ID (or composite ID if there is one). If there IS any scenario where Hibernate calls .equals() on the entity itself (and not the primary key field), then I'd like to know about it. – KyleM Oct 16 '14 at 18:45