0

Here is the question: I am using VisualCron to run a ssis package on SQL Server 2008 R2. The SSIS package will run a query which get millions of rows and output it into a flat file. Sometimes, I found when I run this SSIS package, the sql server doesn't use multi-threading(I can tell that from the activity monitor) , this lead to very long running time about 20 hours. But, if it was using multi-threading it could be done in 8 minutes.

Is there a way to force sql server to use multi-threading whenever it is running this SSIS package?

luk2302
  • 55,258
  • 23
  • 97
  • 137
Jin Tiger
  • 125
  • 1
  • 7

2 Answers2

0

There are a few options for optimizing your query to handle multiple simultaneous operations... or at least improving performance.

  1. Apply OPTION MAXDOP in your query to apply the maximum number of processors (parallelism) available with the operating system. Listed below is an example and here is a link with more detail.

    SELECT FirstName, LastName FROM dbo.Customer OPTION (MAXDOP 1)

  2. Apply NOLOCK in the your query if there is no concern with data in the tables being updated during the SSIS package operation. That is, this works if there is no concern with "dirty reads." See following link and example.

    SELECT FirstName, LastName FROM dbo.Customer WITH(NOLOCK)

  3. Review this link for best practices for improving query performance. There may be additional steps you have not taken or tools you have not applied that can greatly assist in improving performance.

Community
  • 1
  • 1
user3662215
  • 526
  • 4
  • 12
0

The problem is getting clearer.
It's related to how sql decide to use a serial execution plan or a parallel execution plan. That's the optimizer's job. It turns out that I have two tasks in VisualCron that are scheduled to run, they both will run the same big query. The difference is they will get different input parameters.
The first one get parameters that will not deal with too much data.
The second one get parameters that deal with a big amount of data.

I assume the SQL optimizer first see the submitted query, and the query will not get too much data, so it decide to use serial plan.

I guess the plan for this same query is cached, so when it check the second submitted query, the optimizer might check the cache to see if any past evaluated plan exist for this query. Then, if it exist, it will use it. That's why it still choose to use serial plan for the second query.

After I change the order of the two task(I execute the one which will deal with more data first, and then the one that deal with less data), it works, it is now using parallel plan for both. (You may need to restart the instance to clear the cached execution plans)

How the optimizer works is still my assumption.

Other people's post, explaining how optimizer are playing a important role here

http://web.archive.org/web/20180404164406/http://sqlblog.com/blogs/paul_white/archive/2011/12/23/forcing-a-parallel-query-execution-plan.aspx

Community
  • 1
  • 1
Jin Tiger
  • 125
  • 1
  • 7