Under the current implementation of .nodes
, the XML nodes are generated in document order. The result of that is always joined to the original data using a nested loops, which always runs in order also.
Furthermore, inserts are generally serial (except under very specific circumstances that it goes parallel, usually when you have an empty table, and never with an IDENTITY
value being generated).
Therefore there is no reason why the server would ever return rows in a different order than the document order. You can see from this fiddle that that is what happens.
That being said, there is no guarantee that the implementation of .nodes
won't change, or that inserts may in future go parallel, as neither of these is documented anywhere as being guaranteed. So I wouldn't rely on it without an explicit ORDER BY
, and you do not have a column to order it on.
Using an ORDER BY
would guarantee it. The docs state: "INSERT
queries that use SELECT
with ORDER BY
to populate rows guarantees how identity values are computed but not the order in which the rows are inserted."
Even using ROW_NUMBER
as some have recommended is also not guaranteed. The only real solution is to get the document order directly from XQuery.
The problem is that SQL Server's version of XQuery does not allow using position(.)
as a result, only as a predicate. Instead, you can use a hack involving the <<
positional operator.
For example:
SELECT T.X.value('text()[1]', 'nvarchar(100)') as RowLabel,
T.X.value('let $i := . return count(../*[. << $i]) + 1', 'int') as RowNumber
FROM src
CROSS APPLY xmlcolumn.nodes('blah') as T(X);
What this does is:
- Assign the current node
.
to the variable $i
- Takes all the nodes in
../*
i.e. all children of the parent of this node
- ...
[. << $i]
that are previous to $i
- and counts them
- Then add 1 to make it one-based