12

Suppose I have a table with columns A, B, C, D:
 A: auto-generated id (PK)
 B & C: combination must be unique (these are the columns that actually define identity in the business sense)
 D: some other columns

Now, if I create business objects based on this table (e.g. in Java), which would be the better implementation of the equals() method:

  1. define equality based on A
  2. define equality based on B and C

Or does it not really matter which of the two I choose?

Buhake Sindi
willy wonka

3 Answers

21

Definitely B and C, because you want the equals() contract to be valid even before entities are persisted. You say yourself:

these are the columns that actually define identity in the business sense

If that is the case, then that is the logic equals() should use. Database keys are the database's concern and should be of no concern to your business layer.

And don't forget to use the same properties in hashCode(), too.
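As a minimal sketch of what that could look like: the class name `BusinessRecord`, the `String` types for B, C and D, and the accessor names are all assumptions made for illustration, since the question does not give them. equals() and hashCode() read only B and C, so the generated id never affects equality.

```java
import java.util.Objects;

// Hypothetical business object for the table in the question.
// Column types are assumed; only the roles of A, B, C, D come from the question.
final class BusinessRecord {
    private Long a;          // A: auto-generated id, null until persisted
    private final String b;  // B and C together form the business key
    private final String c;
    private String d;        // D: other data, not part of identity

    BusinessRecord(String b, String c) {
        this.b = Objects.requireNonNull(b, "b");
        this.c = Objects.requireNonNull(c, "c");
    }

    Long getA() { return a; }
    void setA(Long a) { this.a = a; }
    String getD() { return d; }
    void setD(String d) { this.d = d; }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof BusinessRecord)) return false;
        BusinessRecord other = (BusinessRecord) o;
        // A and D are deliberately ignored.
        return b.equals(other.b) && c.equals(other.c);
    }

    @Override
    public int hashCode() {
        // Must use exactly the fields that equals() uses.
        return Objects.hash(b, c);
    }
}
```

With this, an instance that has not been saved yet and one loaded from the database with the same B and C compare as equal and land in the same hash bucket.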

Sean Patrick Floyd
  • why exclude `A` in the `equals()` method? – Buhake Sindi Dec 01 '10 at 10:22
  • Because from a business perspective, a persisted entity and a non-persisted entity with equal properties B and C are equal. The business perspective should not care about implementation details like DB id, that's relevant for the model only. – Sean Patrick Floyd Dec 01 '10 at 10:25
  • thanks. i think i'm thinking backwards (putting implementation first; business sense second). your answer made me realize that. thanks. – willy wonka Dec 01 '10 at 10:38
3

I agree with @S.P.Floyd as well. But I wanted to add something more.

There are situations when an entity doesn't have unique business properties. For instance, an entity may only have A (the PK) and B (a business property), but many entities have the same B value.

In this case, it is difficult to write equals() and hashCode(). You certainly do not want to base them on A, as you won't be able to compare a persisted object with one that hasn't been persisted yet. And you can't base them on B alone, because then many objects that are distinct entities would appear to be the same.

What I do in these situations is add a `Date created = new Date();` property, so an entity automatically gets a created timestamp when it is constructed. In my equals() and hashCode() I include both B and created. This isn't perfect, as there is a very slim chance that two objects could be created at the same time (especially in a clustered solution), but it's a start. If you must, add a UID or other generated business property that isn't the database's PK.
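A rough sketch of the B-plus-created idea follows. The class name `TimestampedRecord`, the `String` type for B, and the second constructor (for rebuilding an object from a stored timestamp) are assumptions for illustration. The timestamp is compared by epoch milliseconds, which is a choice of this sketch so that `Date` subclasses returned by a persistence layer compare consistently.

```java
import java.util.Date;
import java.util.Objects;

// Hypothetical entity with a PK (A) and a non-unique business property (B).
final class TimestampedRecord {
    private Long a;              // A: database id, null until persisted
    private final String b;      // B: business property, not unique on its own
    private final Date created;  // set once at creation, stored with the row

    // New entity: the timestamp is taken at construction time.
    TimestampedRecord(String b) {
        this(b, new Date());
    }

    // Rebuilding from storage: reuse the original timestamp (assumed constructor).
    TimestampedRecord(String b, Date created) {
        this.b = Objects.requireNonNull(b, "b");
        this.created = new Date(created.getTime()); // defensive copy, Date is mutable
    }

    void setA(Long a) { this.a = a; }
    Long getA() { return a; }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof TimestampedRecord)) return false;
        TimestampedRecord other = (TimestampedRecord) o;
        return b.equals(other.b)
                && created.getTime() == other.created.getTime();
    }

    @Override
    public int hashCode() {
        return Objects.hash(b, created.getTime());
    }
}
```

Two objects with the same B created in the same millisecond would still collide. That is the slim chance mentioned above, and a generated unique value in place of the timestamp would avoid it.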

Tauren
  • Does the created field map to the DB or is it just an artificial (transient) field? I'd rather not pollute my DB tables just because I'm using a specific technology, here JPA. – Kawu Jul 08 '11 at 20:49
  • @Kawu - in my case, the `created` field does map to the database, but I need that information anyway. It would be much harder to make a transient field work as it would be more likely for dates to collide when pulling a collection of objects from the DB. I consider the created date to be when the DB entity was originally created, not the date the object was pulled from your database and turned into a pojo again. – Tauren Jul 10 '11 at 09:54
2

If (B,C) is a unique pair, there's no need for an auto-generated id in addition. For the table, A is equivalent to (B,C) (one-to-one relation).

You may want or need the extra key, but I agree with seanizer: use (B,C) for equals(). Because A is redundant (and null before the object is persisted), don't use it for equals() or hashCode().
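To illustrate the null-before-persist problem, here is a small sketch of what goes wrong when equality is based on A alone. The class `IdKeyedRecord` and its members are hypothetical, and `setA` stands in for whatever assigns the generated id on save.

```java
import java.util.Objects;

// Hypothetical counter-example: equality based only on the generated id A.
final class IdKeyedRecord {
    private Long a;          // null until the row is inserted
    private final String b;  // business data, ignored by equals()

    IdKeyedRecord(String b) { this.b = b; }

    void setA(Long a) { this.a = a; }
    String getB() { return b; }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof IdKeyedRecord)) return false;
        // Two unsaved objects both have a == null, so they compare as equal.
        return Objects.equals(a, ((IdKeyedRecord) o).a);
    }

    @Override
    public int hashCode() {
        // Changes as soon as the id is assigned.
        return Objects.hashCode(a);
    }
}
```

With this class, any two unsaved instances are equal regardless of their data, and an instance added to a `HashSet` before saving can no longer be found there once its id has been set, because its hash code has changed.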

Andreas Dolk