4

I had a typical non-normalized table (tempTable) with multiple numbered columns (rep1,rep2,...). So i wrote a script to insert the non-normalized data into a normalized table (myTable):

insert into myTable
select idRep,rep FROM
(
    select idRep, ISNULL(rep1,'') as rep FROM tempTable
    union
    select idRep, ISNULL(rep2,'') as rep FROM tempTable
    union
    select idRep, ISNULL(rep3,'') as rep FROM tempTable
    union
    select idRep, ISNULL(rep4,'') as rep FROM tempTable
    union
    select idRep, ISNULL(rep5,'') as rep FROM tempTable
) as t

Note: The table myTable also contains an auto-incremented IDENTITY column as its PRIMARY KEY.

The order rep1, rep2, rep3, rep4, rep5 is important in my scenario. Strangely, when I executed the script, the data wasn't inserted in the correct order such as the auto-generated id '1000' had the value from 'rep3' and the id '1001' had the value from 'rep1'.

Why is that? How was the script executed?

Francis P
  • 13,377
  • 3
  • 27
  • 51

4 Answers4

9

The reason it is not going in the order you expect when using UNION is that union attempts to impose uniquness, so it is processing all of those rows together and bringing them out in the order most convenient for the engine.

If you switch to UNION ALL (which does not try to impose uniqueness) as Parado suggested it will not do the processing and they will go into the table in the order you put them in, almost all the time. This however is not gaurunteed and certain very unusual circumstances going on in other processes (especially ones that somehow touch on your tempTable) can affect it.

If you use an order by as Kash suggests then that will gauruntee the order of the ids (which can matter), but not technically the order that the rows get inserted (which very rarely matters in practice).

There is a good summary of some of this on MSDN.

So, that takes care of the why. As for the how to get what you actually want, I would use Kash's suggestion of adding a column to use with an order by clause, but I would use UNION ALL instead of UNION. Using UNION is like adding and implicit "distinct" requirement, which takes up processor cycles and makes the query plan more complicated.

TimothyAWiseman
  • 14,385
  • 12
  • 40
  • 47
4

Your outer Select has no order, hence the INSERT is not ordered like it seems.

There are a few Ordering Guarantees in SQL Server and an INSERT of SELECT with ORDER BY guarantees computation of identity values as quoted:

INSERT queries that use SELECT with ORDER BY to populate rows guarantees how identity values are computed but not the order in which the rows are inserted

Change your SQL to make it ordered:

insert into myTable
select idRep,rep FROM
(
    select idRep, ISNULL(rep1,'') as rep, 1 as Grp FROM tempTable
    union
    select idRep, ISNULL(rep2,'') as rep, 2 as Grp FROM tempTable
    union
    select idRep, ISNULL(rep3,'') as rep, 3 as Grp FROM tempTable
    union
    select idRep, ISNULL(rep4,'') as rep, 4 as Grp FROM tempTable
    union
    select idRep, ISNULL(rep5,'') as rep, 5 as Grp FROM tempTable
) as t ORDER BY Grp
Kash
  • 8,799
  • 4
  • 29
  • 48
  • I don't think this will guarantee the order. – Kermit Sep 10 '12 at 21:35
  • 1
    You should also use `UNION ALL`. – Kermit Sep 10 '12 at 21:37
  • @njk So SELECT with ORDER BY doesn't guarantee the ordered insertion of those values? Why is that? – Francis P Sep 10 '12 at 21:38
  • `UNION` has been specifically used by OP to weed out duplicates. Why would you use `UNION ALL`? – Kash Sep 10 '12 at 21:39
  • @Kash The OP does not mention removal of duplicates. The requirement that they are moving from non-normalized to normalized would hint that removal of duplicates would be a loss of fidelity. – Kermit Sep 10 '12 at 21:41
  • @Kash see the link in my comment to the question – paparazzo Sep 10 '12 at 21:41
  • @FrancisP An `ORDER BY` on the `Grp` as Kash suggested would only enforce the `Grp`, not the order of the `UNION`ed queries. The `ORDER BY` should be added at the end of each `SELECT`. – Kermit Sep 10 '12 at 21:43
  • @njk: OP needs only the groups to be ordered `The order rep1, rep2, rep3, rep4, rep5 is important in my scenario.`, not the values inside those groups. – Kash Sep 10 '12 at 21:45
  • @Kash You post yourself that the `ORDER BY` needs to be added to the `SELECT`, but you do not add it at the lower level. – Kermit Sep 10 '12 at 21:45
  • @njk: OP has not mentioned any ordering for values having the same rep type (rep1, rep2,... ). Hence you do not need an `ORDER BY` in the lower level. – Kash Sep 10 '12 at 21:46
  • @Blam: Same link in my answer. That just fortifies my answer. – Kash Sep 10 '12 at 21:47
  • @Kash Hopefully TimothyAWiseman's answer clears up the confusion. – Kermit Sep 10 '12 at 21:51
2

Try with union all. It does't sort the data:

insert into myTable
select idRep,rep FROM
(
    select idRep, ISNULL(rep1,'') as rep FROM tempTable
    union all
    select idRep, ISNULL(rep2,'') as rep FROM tempTable
    union all
    select idRep, ISNULL(rep3,'') as rep FROM tempTable
    union all
    select idRep, ISNULL(rep4,'') as rep FROM tempTable
    union all
    select idRep, ISNULL(rep5,'') as rep FROM tempTable
) as t
Robert
  • 25,425
  • 8
  • 67
  • 81
0

Here is another way to do this without unions altogether. It references the source table only once. Sorting is simple.

declare @numbers table (number int)
insert into @numbers (number)
values (1),(2),(3),(4),(5)

insert into mytable
select 
case number
    when 1 then rep1
    when 2 then rep2
    when 3 then rep3
    when 4 then rep4
    when 5 then rep5
    end as rep
from temptable cross join @numbers n
order by number asc, id asc
bigchief
  • 101