I'm struggling to get my head round how the HiLo generator works in NHibernate. I've read the explanation here which made things a little clearer.

My understanding is that each SessionFactory retrieves the high value from the database. This improves performance because we have access to IDs without hitting the database.

The explanation from the above link also states:

For instance, supposing you have a "high" sequence with a current value of 35, and the "low" number is in the range 0-1023. Then the client can increment the sequence to 36 (for other clients to be able to generate keys while it's using 35) and know that keys 35/0, 35/1, 35/2, 35/3... 35/1023 are all available.

How does this work in a web application? Don't I have only one SessionFactory, and therefore only one hi value? Does this mean that in a disconnected application you can end up with duplicate (low) ids in your entity table?

In my tests I used these settings:

<id name="Id" unsaved-value="0">
  <generator class="hilo"/>
</id>

I ran a test to save 100 objects. The IDs in my table went from 32768 - 32868. The next hi value was incremented to 2. Then I ran my test again and the Ids were in the range 65536 - 65636.

First off, why start at 32768 and not 1, and secondly why the jump from 32868 to 65536?

Now I know that my surrogate keys shouldn't have any meaning, but we do use them in our application. Why can't I just have them increment nicely like a SQL Server identity field would?

Finally can someone give me an explanation of how the max_lo parameter works? Is this the maximum number of low values (entity ids in my head) that can be created against the high value?

This is one topic in NHibernate that I have struggled to find documentation for. I read the entire NHibernate in Action book and it still doesn't go into how this works in any detail.

Thanks, Ben

Ben Foster

4 Answers

I believe your understanding is more or less correct. The max_lo parameter is simply used to determine the number of Ids available for any given Hi value.

My best guess is that NHibernate's default max_lo value is 32767, so each Hi value covers max_lo + 1 = 32768 Ids. Thus a Hi value of 1 would start your Ids at 32768 and run you right up to 65535. A Hi value of 2 would start at 65536 and run up another 32768 Ids.

Basically you use the max_lo value to control Id fragmentation. The default is likely not the optimal value for every situation.

It is important to note however that this only works within the scope of a SessionFactory. If you are stopping/starting your application and reinitializing the SessionFactory a whole bunch, it's going to increment the Hi value upon startup anyway and you're going to see your Ids jump pretty quickly.
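To see why restarts make the Ids jump, here is a small Python sketch of the arithmetic (it is not NHibernate code). It assumes the block formula hi * (max_lo + 1) quoted in another answer on this page, a max_lo of 32767, and exactly one hi value consumed per SessionFactory start. The function name is made up for illustration.

```python
def first_id_per_start(start_hi, max_lo, starts):
    """First Id handed out after each of `starts` application starts.

    Assumes every start consumes exactly one hi value and that a block
    begins at hi * (max_lo + 1).
    """
    block_size = max_lo + 1
    return [(start_hi + n) * block_size for n in range(starts)]


first_ids = first_id_per_start(start_hi=1, max_lo=32767, starts=3)
print(first_ids)  # [32768, 65536, 98304]

# If each run saves only 100 objects, the rest of the block is skipped.
saved_per_run = 100
unused_per_run = (32767 + 1) - saved_per_run
print(unused_per_run)  # 32668
```

Under these assumptions the first two values match the starting Ids reported in the question, and the unused remainder of each block is the gap between runs.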

Chris Stavropoulos

Looking at the keys generated by my NHibernate 3 HiLo objects, the algorithm looks like: (Hi * max_lo) + Hi

So with my Hi value in the DB as 390 and with my configuration as follows:

<id name="TimeclockId" column="TimeclockId" type="Int64" unsaved-value="0">
      <generator class="hilo">
        <param name="where">TableId = 1</param>
        <param name="table">HiValue</param>
        <param name="column">NextValue</param>
        <param name="max_lo">10</param>
      </generator>
    </id>

I restart my app pool and get (390 * 10) + 390 = 4290, the range being 4290 - 4300.

This is the reason why you get seemingly strange gaps in your primary keys because the next generated key from a hi value of 391 is 4301, and the range is 4301 - 4311.
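The arithmetic above can be checked with a short Python sketch. It assumes, as the example suggests, that the second factor in the formula is the configured max_lo. block_range is a hypothetical helper, not part of NHibernate.

```python
def block_range(hi, max_lo):
    """Return (first, last) Id of the block for a given hi value."""
    first = hi * max_lo + hi   # algebraically the same as hi * (max_lo + 1)
    last = first + max_lo      # the block holds max_lo + 1 Ids in total
    return first, last


print(block_range(390, 10))  # (4290, 4300)
print(block_range(391, 10))  # (4301, 4311)
```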

gt124
    This is a nice and clean explanation, though they actually shape the formula as (max_lo + 1) * hi. I think it's cleaner and shows that your Id domain is divided into pieces as big as max_lo + 1, not max_lo. – kommradHomer Mar 08 '12 at 13:42
For those wondering how to choose a good max_lo value, the trade-off is essentially between:

  • Frequency with which you need to query a new hi value from the db.
  • Maximum number of unique IDs you can actually generate.

A lower max_lo keeps the "waste" of IDs small, which in turn governs the moment at which you will hit the implicit limit of your datatype (which will likely be int). The price you pay is that each client needs to query and increase the hi value more frequently.

A higher max_lo is useful to reduce the frequency of queries that get and increment hi, but results in more waste.

The metrics you need to take into account to determine the optimal value are:

  • Frequency at which new entities are created and need an ID
  • Frequency at which the application restarts / gets recycled (anything that results in a new NHibernate SessionFactory)

Let's consider a web application that is hosted in IIS and is recycled every 24 hours. The entities are Customer and Order.

Now let's assume:

  • 10000 new Orders per 24 hours
  • 10 new customers per 24 hours

Then the perfect max_lo is 10000 for Orders and 10 for Customers. Of course, in the real world you can never determine it so precisely and clearly, but you should get the idea here!

Now let's consider different scenarios where we pick totally wrong (ridiculous) max_lo values:

  • Suppose 10 customers are placing orders simultaneously every second. With a max_lo of only 10 on orders, there is a superfluous database call to increment hi every second.
  • Suppose your app is a desktop app installed on 50 clients (support staff?), each of which starts it about twice a day. Together they create about 100 helpdesk tickets a day. Now let's say we stick with the max_lo default of 32767. Hi is incremented 100 times a day (50 clients * 2), so if you overlook how often hi gets incremented, you will hit the maximum value of int in less than 2 years. A good max_lo here would be (100 tickets / 50 clients) = only 2.
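The "less than 2 years" figure can be reproduced with a rough Python sketch. It assumes a signed 32-bit Id column, blocks of max_lo + 1 Ids starting at hi * (max_lo + 1) as quoted elsewhere on this page, and a hi counter starting at 0. The function name and parameters are illustrative only.

```python
INT32_MAX = 2**31 - 1


def days_until_exhausted(max_lo, hi_increments_per_day, start_hi=0):
    """Days until the next block would no longer fit in a signed 32-bit int."""
    block_size = max_lo + 1
    # Largest hi whose whole block [hi * block_size, hi * block_size + max_lo] fits.
    max_hi = (INT32_MAX - max_lo) // block_size
    return (max_hi - start_hi) / hi_increments_per_day


default_days = days_until_exhausted(max_lo=32767, hi_increments_per_day=100)
small_days = days_until_exhausted(max_lo=2, hi_increments_per_day=100)

print(round(default_days / 365, 1))  # 1.8 (years)
print(round(small_days / 365))       # on the order of 20,000 years
```

The point of the comparison is that hi is consumed at the same rate in both cases. Only the size of the block thrown away on each restart changes.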

Hope this helps with conceptualizing the HiLo algorithm and its implications in general, while also giving you the maths to actually put a number on max_lo.

MarioDS

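NHibernate 3.1.1 does this to generate an ID using HiLo:

if (lo > maxLo)                       // current block is used up
{
    long hival = <GetNextHiFromDB>;   // placeholder: read and increment the stored hi value
    lo = hival == 0 ? 1 : 0;          // when hival is 0, start at 1 so the first ID is 1, not 0
    hi = hival * (this.maxLo + 1L);   // first ID of the new block
}
long result = hi + lo;
lo++;
return result;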

Inside NHibernate configuration you specify maxLo. If maxLo is set to 100 you will get 101 ids for each hi value.
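As a way to experiment with this logic, here is a hedged Python port of the snippet above. The database call is replaced by an in-memory counter, and the initial value of lo (max_lo + 1, so that the first call fetches a hi) is an assumption. The class and method names are invented for this sketch.

```python
class HiLoSketch:
    """Python port of the snippet above; the hi source is an in-memory stand-in."""

    def __init__(self, max_lo, next_hi=0):
        self.max_lo = max_lo
        self.lo = max_lo + 1     # assumption: forces a hi fetch on the first call
        self.hi = 0
        self._next_hi = next_hi  # stands in for the value stored in the database

    def _get_next_hi(self):
        hival = self._next_hi
        self._next_hi += 1       # a real generator would do this in the database
        return hival

    def generate(self):
        if self.lo > self.max_lo:
            hival = self._get_next_hi()
            self.lo = 1 if hival == 0 else 0
            self.hi = hival * (self.max_lo + 1)
        result = self.hi + self.lo
        self.lo += 1
        return result


gen = HiLoSketch(max_lo=100, next_hi=1)
ids = [gen.generate() for _ in range(102)]
print(ids[0], ids[100], ids[101])  # 101 201 202
```

With max_lo set to 100 and a stored hi of 1, the first 101 calls return 101 through 201, and the 102nd call fetches the next hi and returns 202.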

samfromlv