2

I'm doing some data migration after some data model refactoring and I'm taking a couple tables with composite primary keys and combining them into a larger table and giving it its own unique primary key. At this point, I've written some SQL to copy the old table data into a new table and assign a primary key using AUTO_INCREMENT. After the migration is done, I remove the AUTO_INCREMENT from the PK field. So, now that's all gravy, but the problem is that I need the hibernate sequence to know what the next available PK will be. We use the TABLE strategy generally for all of our entities and I'd like to stay consistent and avoid using AUTO_INCREMENT and the IDENTITY strategy for future objects. I've gotten away with temporarily setting the respective row in the generated "hibernate_sequences" table to the max id of the newly created table, but this is just a bandaid fix to the problem. Also, this results in the next IDs created to be much larger than the max id. I'm certain this is because I don't understand the HiLo id-assigning mechanism, which is why I'm posting here. Is there a way to set this up so that the Ids will be sequential? Or, where is the code that generates the HiLo value so that I can calculate what it should be to ensure sequential ids?

lakemalcom
  • 1,016
  • 1
  • 8
  • 17

2 Answers2

2

If I understood you correctly, the problem is that hibernate doesn't generate sequental IDs for you. But that is how hi/lo generator works and I do not understand exactly why you do not like it.

In basic, Hi/lo generator is based on supporting HIGH and LOW values separately. When LOW reaches its limit it is reset and HIGH is incremented. The result key is based on combining HIGH and LOW values together. E.g. assume key is double word and HIGH and LOW are words. HIGH can be left two bytes and LOW right two bytes.

Jumps in ID depend on two factors - the max value for LOW and on event which triggers changing the value of HIGH. By default in Hibernate, max value for LOW is Short.MAX_VALUE and is reset on each generator initialization. HIGH value is read from the table and incremented on each initialization, also it is incremented when LOW reaches it's upper limit. All this means that on each restart of your application, you will have gaps in IDs.

Looking at the code, it seems, that if you would use value <1 for max_lo, the key will be generated just by incrementing hi value, which is read from DB. You would probably like that behaviour :)

Have a look at the source code for org.hibernate.id.MultipleHiLoPerTableGenerator#generate

Stas
  • 1,707
  • 15
  • 25
  • 1
    To have sequential PKs, you have to hit the DB every time you want to do an insert. HiLo algorithm is designed such that when you query the DB for PKs, instead of getting one PK you reserve a whole "block" of PKs for use. The downside to this is that if a reserved block of PKs is not used, you will have "holes" in the PKs. You have mentioned that the other tables in the DB use HiLo without any issues. Is insertion order a piece of data you want to capture in the DB, which you are deriving right now from the order of PKs? – Binil Thomas Jan 25 '11 at 00:10
  • Please see http://stackoverflow.com/questions/282099/whats-the-hi-lo-algorithm/282113#282113 also. – Binil Thomas Jan 25 '11 at 00:11
  • @binil Thanks for mentionung about db access in case of sequental id generation! Thats important bit which i forgot about. – Stas Jan 25 '11 at 15:06
  • Thanks for all the very informative comments. I haven't settled on the best strategy for how I want to proceed but this certainly answers my questions. – lakemalcom Jan 27 '11 at 18:23
  • 1
    Actually, I realized later that this doesn't answer my question. What I'm asking is that once my table is prepopulated in MySQL, how do I get hibernate to use the next available number in the sequence? It doesn't have to be the next sequential number, but I need a reliable method of producing *definitely* available IDs. – lakemalcom Feb 02 '11 at 23:00
0

Using org.hibernate.id.MultipleHiLoPerTableGenerator#generate, I figured out that my batches were of size 50, and so for my purposes using the max id / 50 + 1 generated a usable number to throw into the sequence to make them as close to sequential as possible.

SteeveDroz
  • 6,006
  • 6
  • 33
  • 65
lakemalcom
  • 1,016
  • 1
  • 8
  • 17