
If I want to union data from multiple tables located on different drives, will SQL Server pull the data in parallel? Are there any related settings or hints I should know about?

Brandon Moore

3 Answers


The UNION should run in parallel, at least since SQL Server 2005.

It doesn't make a difference whether the tables are located on different drives or the same drive. In the modern world, a disk can be virtual or have multiple read heads. The distinction between one drive and more than one drive is less and less relevant.

If you have MAXDOP set to 1, then there will only be one thread.

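Do note that UNION is typically slower than UNION ALL, often by a wide margin on large inputs, because UNION has to remove duplicate rows (usually with a sort or hash step) while UNION ALL simply concatenates the results.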
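To make the difference concrete, here is a minimal sketch using two small table variables. The names @t1 and @t2 and their contents are hypothetical and chosen only for illustration; the multi-row VALUES syntax assumes SQL Server 2008 or later.

```sql
-- Hypothetical sample data, for illustration only.
DECLARE @t1 TABLE (pk_id INT, value VARCHAR(10));
DECLARE @t2 TABLE (pk_id INT, value VARCHAR(10));

INSERT INTO @t1 (pk_id, value) VALUES (1, 'a'), (2, 'b');
INSERT INTO @t2 (pk_id, value) VALUES (2, 'b'), (3, 'c');

-- UNION removes the duplicate row (2, 'b'): 3 rows come back.
SELECT pk_id, value FROM @t1
UNION
SELECT pk_id, value FROM @t2;

-- UNION ALL keeps every row from both inputs: 4 rows come back.
SELECT pk_id, value FROM @t1
UNION ALL
SELECT pk_id, value FROM @t2;
```

If the inputs cannot overlap, or duplicates are acceptable, UNION ALL avoids the de-duplication work entirely.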

Brandon, let me respond here. You seem to be thinking in terms of older-style architectures. These definitely still exist. However, modern disks have multiple read heads and multiple platters. Often, the bottleneck in returning data is the bandwidth at the controller level, not the speed of the read. You also have multiple levels of caching and read-ahead (sometimes at both the file system and database levels). You are often better off letting the database engine manage this complexity.

For instance, the machine that I'm working on right now is really a virtual machine. The disk I use is a partition on an EMC box. The processors are some set of processors in a big box.

Gordon Linoff
  • Hmm... it seems to me that the disk reads would be the longest part, and if the data is all on the same disk then I would think it would make more sense to read in one table and then another and not go back and forth between the two. However if the tables are on separate drives then it would be faster to read in both at the same time. So I don't really understand how it is that you say that doesn't make a difference? What am I missing here? – Brandon Moore Jul 20 '12 at 03:20

My understanding of multi-threading in SQL Server is that we should leave it to the query optimiser - queries will be run in parallel when optimal.

You can limit the number of threads by using the MAXDOP hint (see What is the purpose for using OPTION(MAXDOP 1) in SQL Server?).
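As a rough sketch of what that hint looks like on a union query: dbo.table1 and dbo.table2 below are placeholder names, not tables from the question. The OPTION clause applies to the whole statement, so it goes once at the very end rather than after each SELECT.

```sql
-- Placeholder table names, for illustration only.
-- MAXDOP 1 forces a serial plan for this one statement.
SELECT pk_id, value FROM dbo.table1
UNION ALL
SELECT pk_id, value FROM dbo.table2
OPTION (MAXDOP 1);

-- A higher value caps the number of threads for this statement.
-- It does not force a parallel plan; the optimiser still decides.
SELECT pk_id, value FROM dbo.table1
UNION ALL
SELECT pk_id, value FROM dbo.table2
OPTION (MAXDOP 4);
```

Comparing the actual execution plans of the two statements is one way to see whether the optimiser chose a parallel plan at all.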

The default behaviour is to run in parallel when possible and optimal.

Kirk Broadhurst

I wouldn't count on data being returned in a specific order solely by the order of your union'ed queries.

When I have to do something like that, I always wrap the entire query in a subselect purely to handle sorting, like the following:

select pk_id, value
from (
    select pk_id, value from table1
    union
    select pk_id, value from table2
) as u
order by pk_id, value

That way you're never surprised by what you get back.

whiskeyfur