12

When inserting multiple rows in a table, is there any guarantee that they go in in the order I specify? For instance, take the following:

DECLARE @blah TABLE
(
    ID INT IDENTITY(1, 1),
    Name VARCHAR(100) NOT NULL
);

INSERT INTO @blah (Name)
    VALUES('Timmy'),
    ('Jonny'),
    ('Sally');

SELECT * FROM @blah

Is there any guarantee that Sally will have a higher primary key than Timmy?

Pharylon
  • 9,796
  • 3
  • 35
  • 59
  • 2
    Since you are using identity why does it matter? You are effectively using a random (albeit predictable) number. I am pretty sure they will be inserted in the order specified but does it really matter? – Sean Lange Jan 05 '16 at 21:05
  • 3
    Yes, it matters. This is the simplest form of the question I could come up with, but in the "real world" scenario, I'm outputting the identities, and using them elsewhere. – Pharylon Jan 05 '16 at 21:09
  • 7
    You should never, ever use an identity column like that. If you need to guarantee that numbers follow some sort of predictable sequence then you should be explicit in generating that sequence. You're not looking for an identity column, you're looking for something that has some kind of business meaning in your database and you should define what that meaning is and provide for it. – Tom H Jan 05 '16 at 21:10
  • 1
    If you are going to use the assigned identity values somewhere else then you should use the OUTPUT clause. That way you are 100% certain which identity value goes with which row in the input values. – Sean Lange Jan 05 '16 at 21:12
  • @SeanLange: That's what I'm doing, but according to this article: https://msdn.microsoft.com/en-us/library/ms177564.aspx?f=255&MSPPError=-2147217396 "there is no guarantee that the order in which the changes are applied to the table and the order in which the rows are inserted into the output table or table variable will correspond." But if I know the order the objects were in the input statement (I do), I can rearrange the output table values later to match. – Pharylon Jan 05 '16 at 21:14
  • @TomH, I need the key because it's the foreign key to the next thing I insert. I broadly agree with you, but I'm working with what I got. – Pharylon Jan 05 '16 at 21:17
  • The point I was making is that the order really shouldn't matter but utilizing the OUTPUT clause you can access the values. If you really care about the order of identity values (which is a horrible mistake in my opinion) then you should do each insert individually. – Sean Lange Jan 05 '16 at 21:19
  • 3
    You seem to be relying on the values of your identity properties across multiple tables to stay in synch. This is VERY brittle. Let identity do its thing and utilize that value. Trying to estimate what it might be is bound to cause major problems. – Sean Lange Jan 05 '16 at 21:21
  • I'm not trying to estimate it. Here's a bit more full version of the code http://pastebin.com/N9DCJsN2 (I didn't include `OUTPUT` because I was tying to have the minimum simplest version of the question.) That works to get all the Identities, but according to the article I linked to earlier, I can't be guaranteed the output table is in the same order as the input statement. So I want to know if I can be guaranteed that they did get inserted in the order I specified. If so, I can reorder the output table later. – Pharylon Jan 05 '16 at 21:27
  • @TomH @SeanLange The order in which the `IDENTITY` values are generated is guaranteed if there is `ORDER BY` in the `INSERT`. See my answer for details. – Vladimir Baranov Jan 06 '16 at 03:08
  • Seems like it would be pretty simple to Name (or whatever identifying columns needed) to your table variable so you can output that too. Makes is pretty simple to then get the row values along with the newly created identity values. That way you know 100% that your identity values match up with the data. – Sean Lange Jan 06 '16 at 16:26

2 Answers2

15

The very similar question was asked before.

You can specify an ORDER BY in the INSERT.

If you do that, the order in which the IDENTITY values are generated is guaranteed to match the specified ORDER BY in the INSERT.

Using your example:

DECLARE @blah TABLE
(
    ID INT IDENTITY(1, 1) NOT NULL,
    Name VARCHAR(100) NOT NULL
);

INSERT INTO @blah (Name)
SELECT T.Name
FROM
    (
        VALUES
        ('Timmy'),
        ('Jonny'),
        ('Sally')
    ) AS T(Name)
ORDER BY T.Name;

SELECT
    T.ID
    ,T.Name
FROM @blah AS T
ORDER BY T.ID;

The result is:

+----+-------+
| ID | Name  |
+----+-------+
|  1 | Jonny |
|  2 | Sally |
|  3 | Timmy |
+----+-------+

That is, Name have been sorted and IDs have been generated according to this order. It is guaranteed that Jonny will have the lowest ID, Timmy will have the highest ID, Sally will have ID between them. There may be gaps between the generated ID values, but their relative order is guaranteed.

If you don't specify ORDER BY in INSERT, then resulting IDENTITY IDs can be generated in a different order.

Mind you, there is no guarantee for the actual physical order of rows in the table even with ORDER BY in INSERT, the only guarantee is the generated IDs.

In a question INSERT INTO as SELECT with ORDER BY Umachandar Jayachandran from MS said:

The only guarantee is that the identity values will be generated based on the ORDER BY clause. But there is no guarantee for the order of insertion of the rows into the table.

And he gave a link to Ordering guarantees in SQL Server, where Conor Cunningham from SQL Server Engine Team says:

  1. INSERT queries that use SELECT with ORDER BY to populate rows guarantees how identity values are computed but not the order in which the rows are inserted

There is a link to MS knowledge base article in the comments in that post: The behavior of the IDENTITY function when used with SELECT INTO or INSERT .. SELECT queries that contain an ORDER BY clause, which explains it in more details. It says:

If you want the IDENTITY values to be assigned in a sequential fashion that follows the ordering in the ORDER BY clause, create a table that contains a column with the IDENTITY property and then run an INSERT ... SELECT ... ORDER BY query to populate this table.

I would consider this KB article as an official documentation and consider this behaviour guaranteed.

David Browne - Microsoft
  • 80,331
  • 6
  • 39
  • 67
Vladimir Baranov
  • 31,799
  • 5
  • 53
  • 90
  • Is that a guarantee or current behavior? I would not treat Umachandar Jayachandran's statement as official product documentation here. – usr Jan 06 '16 at 11:37
  • In any case the identity values might have gaps so that it's impossible to recover them. Not sure if he needs that. Probably. – usr Jan 06 '16 at 11:37
  • @usr, the [Ordering guarantees in SQL Server](http://blogs.msdn.com/b/sqltips/archive/2005/07/20/441053.aspx) post has a link to MS knowledge base article [The behavior of the IDENTITY function when used with SELECT INTO or INSERT .. SELECT queries that contain an ORDER BY clause](https://support.microsoft.com/en-us/kb/273586), which explains it in more details. So, yes, I would consider it as an official documentation and consider this behaviour guaranteed. – Vladimir Baranov Jan 06 '16 at 11:46
  • 1
    @usr, Yes, the generated identity values may have gaps, but the question was "Is there any guarantee that Sally will have a higher primary key than Timmy?" The answer is: "yes, if you use appropriate `ORDER BY` in the `INSERT`", while you answered: "no". – Vladimir Baranov Jan 06 '16 at 11:47
  • That is true. I stay skeptical of this guarantee, though. Also he says in the comments: "but in the "real world" scenario, I'm outputting the identities, and using them elsewhere". – usr Jan 06 '16 at 12:13
  • @usr, using `OUTPUT` is a totally different question. If OP needs a mapping between rows in a source and destination tables, he should use `MERGE` with `OUTPUT` instead of simple `INSERT`. `MERGE` allows to include in the `OUTPUT` clause columns from both source and destination tables. `OUTPUT` for `INSERT` allows only columns from destination table. – Vladimir Baranov Jan 06 '16 at 12:26
  • Your guaranteed result is not guaranteed. The ORDER BY in your insert might guarantee the ordering of the identity values, but it does not guarantee that they will be sequential. Inside Microsoft SQL Server 2008 talks about this specifically. – Tom H Jan 06 '16 at 13:33
  • @TomH, you are right. I tried to rephrase the description of the result of the sample query to make it clear. The question never mentions whether IDs are sequential or not and I didn't highlight that point in the answer. – Vladimir Baranov Jan 06 '16 at 14:21
  • 1
    @TomH, I didn't ask if they would be sequential. I asked if the order would be guaranteed. ("Is there any guarantee that Sally will have a higher primary key than Timmy?") I phrased it that way because I knew there would be no guarantee that there would be no gaps. – Pharylon Jan 07 '16 at 13:36
0

Your best two options are:

Process the rows individually - insert the parent row, get the ID, then insert the children, then insert the next parent row, etc.

Alternatively, assuming that the data has an actual identifier in the real world (for example, email address, SSN, etc.) then you can use that to join back to your parent table when inserting into the child table.

Tom H
  • 46,766
  • 14
  • 87
  • 128