
I'm working on a large and gnarly SQL script that is pushing 500 lines. At the moment it is organized roughly like this:

-- Declare and set a couple of dozen variables that will be used in the script
-- Do lots of deletes, inserts, and updates

At the moment it is all done in a single batch, because many of the operations rely on common variables and variables don't cross GO boundaries. Is there any benefit to breaking a script like this up into smaller batches, even if it means redundantly re-declaring and re-setting some of the variables in each batch? The result would look roughly like this (a concrete sketch follows the outline):

-- Declare a few variables
-- Run the delete, update, and insert operations that rely on those variables
-- GO

-- Declare a few variables, some new and some the same as from the previous batch
-- Run the delete, update, and insert operations...
-- GO

-- Rinse and repeat about a dozen times
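
To make the re-declaration concrete, here is a minimal T-SQL sketch of the batched structure. The table, column, and variable names (dbo.Widgets, @CutoffDate, @Region) are hypothetical placeholders, not from the actual script:

    DECLARE @CutoffDate date = '2017-01-01';
    DECLARE @Region     int  = (SELECT TOP (1) RegionId
                                FROM dbo.Regions
                                WHERE Name = 'West');

    DELETE dbo.Widgets WHERE CreatedOn < @CutoffDate AND RegionId = @Region;
    UPDATE dbo.Widgets SET Archived = 1 WHERE RegionId = @Region;
    GO

    -- @CutoffDate did not survive the GO above, so it must be re-declared
    DECLARE @CutoffDate date = '2017-01-01';

    INSERT dbo.WidgetHistory (WidgetId)
    SELECT WidgetId FROM dbo.Widgets WHERE CreatedOn < @CutoffDate;
    GO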

Getting the values for the variables is inexpensive; usually they are set to literals or to the result of selecting from tables with only tens of rows.

Each of the deletes, updates, and inserts operates on a set of roughly 1 million to 5 million rows.

Is there a theoretical memory, storage (for the log files), and/or performance improvement to be gained by breaking this up into multiple batches, and would it outweigh the ugliness and annoyance of re-declaring and re-setting some variables multiple times?

What resources are out there to learn more about batches in this kind of situation? I know there are cases where batches are required, such as creating or altering tables before operating on the new tables/columns. The answers to this question indicate that there may be some benefit to log file size when using smaller batches. What I haven't been able to find is a definitive source of information on the possible scenarios and performance benefits of using smaller batches.

Thanks


2 Answers


Yes, you should. Split it into logical blocks.

For example, move each logical block into its own stored procedure and have a parent procedure orchestrate them:

    CREATE PROCEDURE base_data
    AS
    BEGIN
        EXEC base_data_address;
        EXEC base_data_name;
        EXEC base_data_date;
    END;
  • Easy to read.
  • Easy to maintain and debug.
  • Easy to reuse procedures with parameters.
  • Easy to control transaction flow.
  • Easy to handle errors (see the sketch after this list).
  • Easy to add new blocks.
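
To illustrate the transaction-flow and error-handling points, here is a minimal sketch of what one child procedure might look like; the procedure, parameter, and table names are hypothetical, and THROW assumes SQL Server 2012 or later:

    CREATE PROCEDURE base_data_address
        @CutoffDate date
    AS
    BEGIN
        SET NOCOUNT ON;

        BEGIN TRY
            BEGIN TRANSACTION;

            -- one logical unit of work per procedure
            DELETE dbo.Address WHERE ModifiedOn < @CutoffDate;

            COMMIT TRANSACTION;
        END TRY
        BEGIN CATCH
            IF @@TRANCOUNT > 0 ROLLBACK TRANSACTION;
            THROW;  -- re-raise so the calling procedure can react
        END CATCH
    END;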

If you have to pass a lot of variables between blocks, it usually means something is wrong with the design. To shorten the code, use views and functions.


Readability: absolutely. It is far better to break the script apart into several small, easy-to-understand queries, and it is easier to maintain.

Performance: it varies, and you have to look at the execution plan to be sure. SQL Server parallelism already does a very good job of splitting a query across threads, but smaller statements at least give the optimizer a better chance of coming up with a good query plan.

The one thing I have always gained from breaking down queries is avoiding transaction log errors: a humongous statement that fills the log works well when broken into smaller pieces (one common chunking pattern is sketched below).
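
A minimal sketch of that chunking pattern, with dbo.BigTable and the 50,000-row chunk size as hypothetical stand-ins; under the SIMPLE recovery model each committed chunk lets log space be reused (under FULL, log backups do that):

    DECLARE @rows int = 1;

    WHILE @rows > 0
    BEGIN
        -- each iteration commits its own transaction, so the log
        -- never has to hold the entire delete at once
        DELETE TOP (50000) FROM dbo.BigTable
        WHERE CreatedOn < '2017-01-01';

        SET @rows = @@ROWCOUNT;
    END;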
