
I am pretty new to SQL, but I need to work out how to make a whole bunch of existing queries I have inherited run faster.

Here is the WHERE clause for one of them:

 @SearchFor nvarchar(200)        
 ,@SearchIn nvarchar(1024)      
 ,@SearchActivity int   

    -- stuff here I've left out ---

WHERE
    ( -- filter for all queries --
            (FT_TBL.IsActive = 1)
        AND (FT_TBL.IsPro = 1)
        AND (
            (FT_TBL.DatePaidUpTo >= @CurrentTime)
            OR (
                (FT_TBL.IsFromPaidBusinessDB = 1) 
                AND (FT_TBL.DatePaidUpTo IS NULL)
            )
        )
        AND (aspnet_Membership.IsApproved = 1)
    )       
    AND ( -- filter if user fills in 'searchfor' box --
           (@SearchFor IS NULL)
        OR (FT_TBL.CompanyName like '%' + @SearchFor + '%') 
        OR (aspnet_Users.UserName like '%' + @SearchFor + '%')
        OR (Activities.Activity like '%' + @SearchFor + '%')
    )
    AND ( -- filter if user fills in 'searchIn' box --
           (@SearchIn IS NULL)
        OR (a1.City LIKE '%' + @SearchIn + '%') 
        OR (a1.Region LIKE '%' + @SearchIn + '%') 
        OR (a1.Postcode LIKE '%' + @SearchIn + '%') 
        OR (Country.Name LIKE '%' + @SearchIn + '%')
    )
    AND ( -- filter by activity --
           (@SearchActivity IS NULL)
        OR (@SearchActivity = Activities.ActivityID)
    )

    AND NOT EXISTS (SELECT a2.AddressId, a2.UserId
            FROM Addresses a2
            WHERE a2.UserId = a1.UserId
            AND a2.AddressId < a1.AddressId
    )

SearchIn, SearchFor, and SearchActivity are three parameters that can be passed through to filter search results. The idea is that if NULL is passed for all of these, the only constraints on the search results come from the first block in the WHERE clause. If any of these three parameters is not null, the results are further constrained based on location, name, or category for that row. The final block is a little trick: users can have several addresses attached to them, but there should only be one row returned for each user, so this block just picks the address with the lowest ID.

At the moment this query runs very slowly - partly because our hardware is under par, but also, I suspect, because this proc isn't efficient enough. I am trying to work out ways to make it better - difficult, as I am trying to learn how SQL works at the same time!

One of the ideas I had was to try doing the search in two stages - for instance, first do a query with only the first block of the WHERE clause, and then run a second query on the resulting table with the rest of the blocks in the WHERE clause. As the initial block filters out a lot of the rows in the table, I thought this could help.
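Something like this rough sketch is what I have in mind (I've used `Businesses` as a stand-in for whatever table sits behind the `FT_TBL` alias, and `#FirstPass` is just a scratch temp table I made up):

    -- stage 1: the cheap filters that apply to every search
    SELECT FT_TBL.*
    INTO #FirstPass
    FROM Businesses AS FT_TBL   -- stand-in name for the real table
    JOIN aspnet_Membership
        ON aspnet_Membership.UserId = FT_TBL.UserId
    WHERE FT_TBL.IsActive = 1
        AND FT_TBL.IsPro = 1
        AND (FT_TBL.DatePaidUpTo >= @CurrentTime
            OR (FT_TBL.IsFromPaidBusinessDB = 1 AND FT_TBL.DatePaidUpTo IS NULL))
        AND aspnet_Membership.IsApproved = 1;

    -- stage 2: the user-supplied filters, run against the smaller set
    SELECT *
    FROM #FirstPass
    WHERE @SearchFor IS NULL
        OR CompanyName LIKE '%' + @SearchFor + '%';
    -- ...and so on for the other filter blocks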

Can someone suggest a good way to go about improving this query? Also, what tools are best for testing the efficiency of a query? Sometimes the execution time can vary wildly, even for the same inputs.

Oliver
    FYI, none of your `LIKE` comparisons can use an index, so I'm counting a minimum of **5 table scans** from that alone... – JNK Aug 09 '11 at 17:40
  • I don't really see how I can do it any other way though - it's a search based on user text input. – Oliver Aug 09 '11 at 17:55
  • When you say 'a minimum of 5 table scans', is that only when `@SearchFor` and `@SearchIn` are not null? I was hoping that by putting the `@Search_ is NULL` expression at the beginning of the list of ORs, the system would not try to process the various `LIKE`s in this case. – Oliver Aug 09 '11 at 18:09
  • `OR` does not always short circuit. `SQL` doesn't work like an iterative language. Both sides of an `OR` statement may or may not be evaluated by the engine depending on the predicted performance. – JNK Aug 09 '11 at 18:10
  • Ouch, that makes it more difficult. I'll see if I can work out something with IF/CASE to avoid this structure then. – Oliver Aug 09 '11 at 18:19
  • `CASE` is one of the only guaranteed short circuits in SQL Server. – JNK Aug 09 '11 at 18:20

4 Answers


Some points:

  1. First of all: learn to read execution plans! Paste your final query into SSMS and let SQL Server show you the execution plan for your query. Then look at which parts take up the most time (typically table scans and clustered index scans). These are the points where you should have a closer look.

  2. Optimize your query so it can use indexes. That is, get rid of the LIKE '%value%' parts in your query. If possible, restrict users to begins-with or exact-match searches (see the example after this list). The contains-style search starts to cause performance problems once your table reaches some 10k rows.

  3. Drop that last NOT EXISTS block, as it is very expensive. Do that filtering when displaying the results instead.

  4. Add indexes to columns where appropriate. Depending on your SQL Server version, it may even give you hints about missing indexes.

  5. If you can't really identify the bottleneck of your query, start stripping parts of the query away and watch the effects on performance and the changes in the execution plan.
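To illustrate point 2: only a begins-with search is sargable, so an index on the searched column can actually be used (sketched here against the `CompanyName` column from your query, assuming an index exists on it):

    -- leading '%' forces a full scan; no index on CompanyName can help:
    SELECT CompanyName FROM Businesses
    WHERE CompanyName LIKE '%' + @SearchFor + '%';

    -- a begins-with search can seek on an index on CompanyName:
    SELECT CompanyName FROM Businesses
    WHERE CompanyName LIKE @SearchFor + '%';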

MicSim
  • Thanks for your help. I can't really do something with the `LIKE '%value%'`, but I can do something about the `NOT EXISTS` block. I'll try looking into how the execution plan works too. – Oliver Aug 09 '11 at 18:17

As far as I can see, there is a general problem with queries that contain a lot of WHERE conditions: SQL Server may not be able to find a good execution plan for your query, since there may be too many columns involved in the search.

Also, a LIKE with '%' + searchWord + '%' may return the values that you need, but it effectively prevents index usage, since the leading '%' means that everything has to be searched. Maybe you can gather the most common search scenarios and optimize for those (collect statistics, look at the execution plans, create indexes for them, etc.). Having one query that does everything is always hard to optimize.

Maybe you can put a clustered index on the column that limits the result the most (e.g. a date or something) to avoid table scans.
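For example (using `Businesses` as a stand-in for the table behind the `FT_TBL` alias - and note that a table can only have one clustered index, so this only applies if the primary key isn't already the clustered one):

    -- hypothetical: cluster on the column that filters out the most rows,
    -- so the date comparison becomes a range seek instead of a table scan
    CREATE CLUSTERED INDEX IX_Businesses_DatePaidUpTo
        ON Businesses (DatePaidUpTo);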

Bernhard Kircher
  • Thanks for your comments. Due to the requirements for this query it is difficult to avoid using `LIKE` like that. I'll try putting indexes on some of the other fields though. – Oliver Aug 09 '11 at 18:13

You need to understand that how you write a query affects whether an index can be used or not. By reading, learning, and using the techniques in Dynamic Search Conditions in T-SQL by Erland Sommarskog, you will have a much better chance of getting an index used (resulting in faster queries).

If you are running the latest version of SQL Server 2008, this is the quick answer (which is a technique described in the previous link): How can I use optional parameters in a T-SQL stored procedure?
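The short version of that technique looks something like this (just a sketch against your columns, with `Businesses` standing in for the real table; with OPTION (RECOMPILE), SQL Server builds a fresh plan using the actual parameter values, so the `@x IS NULL` branches drop out):

    -- sketch only: static SQL plus OPTION (RECOMPILE), per the links above
    SELECT FT_TBL.CompanyName
    FROM Businesses AS FT_TBL       -- stand-in for the real table name
    WHERE (@SearchFor IS NULL
            OR FT_TBL.CompanyName LIKE '%' + @SearchFor + '%')
        AND (@SearchActivity IS NULL
            OR FT_TBL.ActivityID = @SearchActivity)   -- assumed column
    OPTION (RECOMPILE);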

KM.

When I'm facing a problem with unavoidably slow queries, I'll split the query into two stages.

Stage one retrieves a list of record IDs into a @table variable, applying only those WHERE clauses that I know are indexed and fast.

Stage two joins the @table against the full many-fielded many-joined query, where I can then apply slower filters such as LIKE clauses and UDF calls. Stage two is fast because it applies the slow filters only against a small recordset.
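A rough outline of the pattern (table and column names here are placeholders for your own schema):

    -- stage one: indexed, fast predicates only; keep just the keys
    DECLARE @candidates TABLE (UserId uniqueidentifier PRIMARY KEY);

    INSERT INTO @candidates (UserId)
    SELECT b.UserId
    FROM Businesses AS b            -- placeholder for the real table
    WHERE b.IsActive = 1
        AND b.IsPro = 1;

    -- stage two: join the small key set back to the full query and
    -- apply the slow filters (LIKEs, UDF calls) to far fewer rows
    SELECT b.CompanyName
    FROM @candidates AS c
    JOIN Businesses AS b
        ON b.UserId = c.UserId
    WHERE @SearchFor IS NULL
        OR b.CompanyName LIKE '%' + @SearchFor + '%';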