How can the date a row was added be in a different order to the identity field on the table?

Question

I have a 'change history' table in my SQL Server DB called tblReportDataQueue that records changes to rows in other source tables.

There are triggers on the source tables in the DB which fire after INSERT, UPDATE or DELETE. The triggers all call a stored procedure that just inserts data into the change history table that has an identity column:

INSERT INTO tblReportDataQueue
    (
        [SourceObjectTypeID],
        [ActionID],
        [ObjectXML],
        [DateAdded],
        [RowsInXML]
    )
VALUES
    (
        @SourceObjectTypeID,
        @ActionID,
        @ObjectXML,
        GetDate(),
        @RowsInXML
    )

When a row in a source table is updated multiple times in quick succession the triggers fire in the correct order and put the changed data in the change history table in the order that it was changed. The problem is that I had assumed that the DateAdded field would always be in the same order as the identity field but somehow it is not.

So my table is in the order that things actually happened when sorted by the identity field but not when sorted by the 'DateAdded' field.

How can this happen?

screenshot of example problem

In example image 'DateAdded' of last row shown is earlier than first row shown.

Identity reseed? Daylight saving time adjustments? Server time adjustments? — Sean Lange, Feb 12 '15 at 18:24
Just a guess, but if the insert is being blocked, does it still get the date from the time when it started to do the insert? Is the time difference big? — James Z, Feb 12 '15 at 18:25
Thanks for the replies. No identity reseed or time adjustments. The time difference is a few hundreds of a second — Pete, Feb 12 '15 at 18:30
Can it pull the date from the inserted data rather than on the fly to prevent multi-threading issues? — Kevin Cook, Feb 12 '15 at 18:43
Tested blocking, in that case the inserted record gets the time after the blocking ends. Can't really test if it's possible that the thread gets thrown out of cpu between getting the time and doing the insert -- and if another thread can do its insert during that time. — James Z, Feb 12 '15 at 19:29

score 1 · Answer 1 · answered Feb 12 '15 at 20:27

1

You are using a surrogate key. One very important characteristic of a surrogate key is that it cannot be used to determine anything about the tuple it represents, not even the order of creation. All systems which have auto generated values like this, including Oracles sequences, make no guarantee as to order, only that the next value generated will be unique from previous generated values. That is all that is required, really.

We all do it, of course. We look at a row with ID of 2 and assume it was inserted after the row with ID of 1 and before the row with ID of 3. That is a bad habit we should all work to break because the assumption could well be wrong.

You have the DateAdded field to provide the information you want. Order by that field and you will get the rows in order of insertion (if that field is not updateable, that is). The auto generated values will tend to follow that ordering, but absolutely do not rely on that!

answered Feb 12 '15 at 20:27

TommCatt

5,498
1
13
20

Thanks @TommCatt, I'm sure that's all true but I think in my case the IDs are correct. The image in the link shows a screenshot of an instance of my issue [link](http://gyazo.com/2e346bdb70a354e5da22de676039119e). The first row with ID 35006711 is the insert of a row then down to 35006717 are subsequent updates. I know they're the correct order by looking at the XML of what was changed (in ObjectXML) but the 'DateAdded' of row 35006717 is earlier than the insert for the row it was recording an update for (ID 35006711). I still don't understand how this can be? – Pete Feb 13 '15 at 10:50
I don't know the particular system you're using but it may have split the inserts into chunks and assigned the chunks to different threads. Each thread would be performing "in parallel". What that really means, of course, is "one at a time." Each task could have been assigned a small cache of consecutive values. So a task that is active and inserting may be using a series of values higher than a waiting task. When the waiting task resumes, it inserts smaller values but at a later time. The results can be aesthetically displeasing, but there is nothing wrong. – TommCatt Feb 13 '15 at 16:08

score 0 · Answer 2 · edited May 23 '17 at 12:12

try use Sequence...

     "Using the identity attribute for a column, you can easily generate auto-

    incrementing numbers (which as often used as a primary key). With Sequence, it 

    will be a different object which you can attach to a table column while 

    inserting. Unlike identity, the next number for the column value will be 

    retrieved from memory rather than from the disk – this makes Sequence 

    significantly faster than Identity.

Unlike identity column values, which are generated when rows are inserted, an

 application can obtain the next sequence number before inserting the row by 

calling the NEXT VALUE FOR function. The sequence number is allocated when NEXT 

VALUE FOR is called even if the number is never inserted into a table. The NEXT

 VALUE FOR function can be used as the default value for a column in a table 

definition. Use sp_sequence_get_range to get a range of multiple sequence 

numbers at once."

How can the date a row was added be in a different order to the identity field on the table?

2 Answers2