19

My understanding of Hibernate is that as objects are loaded from the DB they are added to the Session. At various points, depending on your configuration, the session is flushed. At this point, modified objects are written to the database.

How does Hibernate decide which objects are 'dirty' and need to be written?

Do the proxies generated by Hibernate intercept assignments to fields, and add the object to a dirty list in the Session?

Or does Hibernate look at each object in the Session and compare it with the object's original state?

Or something completely different?

Vlad Mihalcea
tgdavies
  • It's great that you're here with us and you're building your reputation score and we don't write programs from scratch here. Do reacher before posting any questions and please read https://stackoverflow.com/help/how-to-ask –  Apr 19 '21 at 16:42
  • What does 'reacher' mean in this context, @nirazv? – tgdavies Apr 19 '21 at 17:03

5 Answers

24

Hibernate can use bytecode generation (CGLIB) so that it knows a field is dirty as soon as you call the setter (or even assign to the field, as far as I can tell).

This immediately marks that field/object as dirty, but doesn't reduce the number of objects that need to be dirty-checked during flush. All it does is impact the implementation of org.hibernate.engine.EntityEntry.requiresDirtyCheck(). It still does a field-by-field comparison to check for dirtiness.

I say the above based on a recent trawl through the source code (3.2.6GA), with whatever credibility that adds. Points of interest are:

  • SessionImpl.flush() triggers an onFlush() event.
  • SessionImpl.list() calls autoFlushIfRequired(), which triggers an onAutoFlush() event (on the tables of interest). That is, queries can invoke a flush. Interestingly, no flush occurs if there is no transaction.
  • Both those events eventually end up in AbstractFlushingEventListener.flushEverythingToExecutions(), which ends up (amongst other interesting locations) at flushEntities().
  • That loops over every entity in the session (source.getPersistenceContext().getEntityEntries()) calling DefaultFlushEntityEventListener.onFlushEntity().
  • You eventually end up at dirtyCheck(). That method does make some optimizations with respect to CGLIB dirty flags, but we've still ended up looping over every entity.
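
To make the shape of that loop concrete, here is a minimal sketch in plain Java. It is not Hibernate code: FlushLoopSketch, Account, Entry, visited and compared are hypothetical names invented for illustration, and only the method names requiresDirtyCheck() and flushEntities() are borrowed from the source walk-through above. It assumes a setter that raises a flag, the way an instrumented class might, and shows that the flag can spare an entity the comparison but not the visit.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

// Illustrative only: a toy model of the flush loop, not Hibernate's implementation.
class FlushLoopSketch {

    /** Entity whose setter records a change, as an instrumented class might. */
    static class Account {
        private String owner;
        boolean dirtyFlag; // raised by the "intercepted" setter

        Account(String owner) { this.owner = owner; }

        String getOwner() { return owner; }

        void setOwner(String owner) {
            this.owner = owner;
            this.dirtyFlag = true;
        }
    }

    /** Hypothetical per-entity bookkeeping: the instance plus its loaded state. */
    static class Entry {
        final Account entity;
        String loadedOwner;

        Entry(Account entity) {
            this.entity = entity;
            this.loadedOwner = entity.getOwner();
        }

        // The flag only answers "is a comparison needed?", nothing more.
        boolean requiresDirtyCheck() { return entity.dirtyFlag; }
    }

    final List<Entry> entries = new ArrayList<>();
    int visited;  // entries the loop looked at
    int compared; // entries that got a value comparison

    Account manage(Account account) {
        entries.add(new Entry(account));
        return account;
    }

    /** Visits every managed entry; compares values only where the flag is set. */
    List<Account> flushEntities() {
        List<Account> toUpdate = new ArrayList<>();
        for (Entry e : entries) {
            visited++;
            if (!e.requiresDirtyCheck()) {
                continue; // comparison skipped, but the entry was still visited
            }
            compared++;
            if (!Objects.equals(e.loadedOwner, e.entity.getOwner())) {
                toUpdate.add(e.entity);              // a real ORM would schedule an UPDATE
                e.loadedOwner = e.entity.getOwner(); // loaded state now matches
            }
            e.entity.dirtyFlag = false;
        }
        return toUpdate;
    }

    public static void main(String[] args) {
        FlushLoopSketch session = new FlushLoopSketch();
        session.manage(new Account("ann"));
        Account bob = session.manage(new Account("bob"));
        bob.setOwner("robert");
        List<Account> updates = session.flushEntities();
        System.out.println("visited=" + session.visited
                + " compared=" + session.compared
                + " updates=" + updates.size());
    }
}
```

Note that setting a property to the value it already had raises the flag but produces no update, which is why the value comparison is still needed in this sketch.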
Bhesh Gurung
Matt Quail
  • Please explain further: if the dirty flag doesn't reduce the number of objects that need to be dirty-checked, what is it useful for? Furthermore, in the source I see two different dirty flags: one is in AbstractFieldInterceptor and is checked by EntityEntry.requiresDirtyCheck(); the other is in AbstractPersistentCollection and is commented "collections detect changes made via their public interface and mark themselves as dirty as a performance optimization" (I don't know where it's checked in the code). – Pino Sep 14 '12 at 14:38
  • @Matt Quail, I couldn't find the dirty flags you're talking about; there is only the one that can be used if build-time bytecode instrumentation is used. – Stanislav Bashkyrtsev Sep 28 '14 at 05:57
5

Hibernate takes a snapshot of the state of each object that gets loaded into the Session. On flush, each object in the Session is compared with its corresponding snapshot to determine which ones are dirty. SQL statements are issued as required, and the snapshots are updated to reflect the state of the (now clean) Session objects.
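
As a rough illustration of that snapshot approach, here is a small self-contained sketch. It does not use Hibernate at all: SnapshotSessionSketch, Person, stateOf, load and flush are hypothetical names made up for this example, and the "session" is just a map from each managed object to a copy of its property values taken at load time.

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Illustrative only: a toy persistence context, not Hibernate's API.
class SnapshotSessionSketch {

    /** A trivially simple "entity" with two persistent properties. */
    static class Person {
        String name;
        int age;

        Person(String name, int age) {
            this.name = name;
            this.age = age;
        }
    }

    // Loaded-state snapshot per managed instance, keyed by object identity.
    private final Map<Person, Object[]> snapshots = new IdentityHashMap<>();

    private static Object[] stateOf(Person p) {
        return new Object[] { p.name, p.age };
    }

    /** Simulates a load: creates the entity and records its state. */
    Person load(String name, int age) {
        Person p = new Person(name, age);
        snapshots.put(p, stateOf(p));
        return p;
    }

    /** Compares every managed entity with its snapshot and returns the dirty ones. */
    List<Person> flush() {
        List<Person> dirty = new ArrayList<>();
        for (Map.Entry<Person, Object[]> e : snapshots.entrySet()) {
            Object[] current = stateOf(e.getKey());
            Object[] loaded = e.getValue();
            boolean changed = false;
            for (int i = 0; i < current.length; i++) {
                if (!Objects.equals(current[i], loaded[i])) {
                    changed = true;
                    break;
                }
            }
            if (changed) {
                dirty.add(e.getKey()); // a real ORM would issue an UPDATE here
                e.setValue(current);   // refresh the snapshot: the entity is clean again
            }
        }
        return dirty;
    }

    public static void main(String[] args) {
        SnapshotSessionSketch session = new SnapshotSessionSketch();
        session.load("Ann", 30);
        Person bob = session.load("Bob", 40);
        bob.age = 41; // plain field write, no interception involved
        System.out.println("dirty after first flush: " + session.flush().size());
        System.out.println("dirty after second flush: " + session.flush().size());
    }
}
```

The cost is visible in flush(): every managed object is compared property by property, whether or not anything touched it.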

alasdairg
  • In Hibernate the situation is as you describe: an attribute-by-attribute comparison of snapshot objects with the session objects. This is in contrast to DataNucleus (in either JDO or JPA mode), where byte code enhancement enables it to do very intelligent, highly performant things like adding a dirty flag to each persistent class. What this means is that during a flush in DataNucleus, the dirtiness of any object is a simple check of the dirty flag. You don't need to be Einstein to work out why, with any non-trivial object-oriented domain model, Hibernate's flush takes so much longer. – Volksman Mar 14 '14 at 23:24
1

Hibernate's default dirty checking mechanism traverses the currently attached entities and matches all of their properties against the values they had at loading time.

You can better visualize this process in the following diagram:

[Diagram: Default automatic dirty checking]

Vlad Mihalcea
1

Take a look at org.hibernate.event.def.DefaultFlushEntityEventListener.dirtyCheck. Every entity in the session goes through this method, which determines whether it is dirty by comparing it with an untouched version (one from the cache or one from the database).

Jeroen Wyseur
0

These answers are incomplete (at best; I am not an expert here). If you have a Hibernate-managed entity in your session and you do NOTHING to it, you can still get an update issued when you call save() on it. When? When another session updates that object between your load() and save(). Here is my example of this: hibernate sets dirty flag (and issues update) even though client did not change value

tom