I've had a SQL performance review done on a project we're working on, and one 'Critical' item that has come up is this:

This kind of wildcard query pattern will cause a table scan, resulting in poor query performance.

 SELECT *
 FROM TabFoo
 WHERE ColBar = @someparam OR @someparam IS NULL

Their recommendation is:

In many cases, an OPTION (RECOMPILE) hint can be a quick workaround. From a design point of view, you can also consider using separate If clauses or (not recommended) use a dynamic SQL statement.
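
For reference, the hint they mention attaches to the statement itself; applied to their example it would look something like this (@someparam being the procedure's parameter):

    SELECT *
    FROM TabFoo
    WHERE ColBar = @someparam OR @someparam IS NULL
    OPTION (RECOMPILE);  -- a fresh plan is compiled on every execution, using the actual parameter value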

Dynamic SQL surely isn't the right way forward. Basically the procedure is one where I am searching for something, OR something else. Two parameters come into the procedure, and I am filtering on one or the other.

A better example than what they showed is:

SELECT ..
FROM ...
WHERE (ColA = @ParA OR @ParA IS NULL)
  AND (ColB = @ParB OR @ParB IS NULL)

Is that bad practice, and besides dynamic SQL (because I thought dynamic SQL can't really compile and be more efficient in its execution plan?), how would this best be done?
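
For what it's worth, parameterized dynamic SQL built with sp_executesql does get cached, reusable plans (one per distinct statement text), which is the usual counter to that concern. A rough sketch, with the procedure name and the varchar(32) parameter types assumed for illustration:

    CREATE PROCEDURE dbo.spSearchFoo          -- hypothetical name
        @ParA varchar(32) = NULL,
        @ParB varchar(32) = NULL
    AS
    BEGIN
        DECLARE @sql nvarchar(max) = N'SELECT * FROM TabFoo WHERE 1 = 1';

        -- Only append the predicates that are actually needed.
        IF @ParA IS NOT NULL SET @sql += N' AND ColA = @ParA';
        IF @ParB IS NOT NULL SET @sql += N' AND ColB = @ParB';

        -- sp_executesql parameterizes the statement, so each distinct
        -- statement text gets its own cached, reusable plan.
        EXEC sys.sp_executesql @sql,
            N'@ParA varchar(32), @ParB varchar(32)',
            @ParA = @ParA, @ParB = @ParB;
    END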

Anil
Craig
  • I do not see you using wildcards at all. – PM 77-1 Jan 15 '14 at 00:51
  • The only wildcard I see is in the select clause. It's always good practice to list out (and minimize) the columns returned. Is that what your review is talking about? – Malk Jan 15 '14 at 00:52
  • The term wildcard is misleading; he's referring to the WHERE clause technique being used (which I've used a number of times myself). – CodeMonkey1313 Jan 15 '14 at 00:54
  • I don't know about performance issues, but I stay away from SELECT * in VIEWS as they can break when the underlying table changes (See http://www.mssqltips.com/sqlservertip/1427/table-changes-not-automatically-reflected-in-a-sql-server-view/). Also see the answer http://stackoverflow.com/questions/3639861/why-is-select-considered-harmful. – thinkOfaNumber Jan 15 '14 at 00:59
  • I use this pattern all the time. It does not cause a table scan on indexed columns. However, you do need to be worried about parameter sniffing. I have seen it cripple stored procedures before. The OPTION (RECOMPILE) hint helps, but I just make it a habit to copy parameter variables into local variables first thing. – Malk Jan 15 '14 at 01:02
  • The wildcard term makes sense to me. If you extend the idea of filename wildcards like `dir *.*` to the world of SQL with `select *` I can see how you'd think of this as a "wildcard" for all rows. – shawnt00 Sep 22 '20 at 08:10

3 Answers

A query like

select *
from foo
where foo.bar = @p OR @p is null

might or might not cause a table scan. My experience is that it will not: the optimizer is perfectly able to do an index seek on the expression foo.bar = @p, assuming a suitable index exists. Further, it's perfectly able to short-circuit things if the variable is null. You won't know what your execution plan looks like until you try it and examine the actual, bound execution plan. A better technique, however, is this:

select *
from foo
where foo.bar = coalesce(@p,foo.bar)

which will give you nearly the same behavior (note, though, that the coalesce form won't return rows where foo.bar itself is null when @p is null, so check that that's acceptable for your data).

If you are using a stored procedure, one thing that can and will bite you in the tookus is something like this:

create procedure dbo.spFoo
  @p varchar(32)
as

  select *
  from dbo.foo
  where foo.bar = @p or @p is null

  return @@rowcount

The direct use of the stored procedure parameter in the where clause will cause the cached execution plan to be based on the value of @p on its first execution. That means that if the first execution of your stored procedure has an outlier value for @p, you may get a cached execution plan that performs really poorly for the 95% of "normal" executions and really well only for the oddball cases. To prevent this from occurring, you want to do this:

create procedure dbo.spFoo
  @p varchar(32)
as

  declare @pMine varchar(32)
  set @pMine = @p

  select *
  from dbo.foo
  where foo.bar = @pMine or @pMine is null

  return @@rowcount

That simple assignment of the parameter to a local variable makes it an expression and so the cached execution plan is not bound to the initial value of @p. Don't ask how I know this.
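
If you'd rather not litter the procedure with shadow variables, the same intent can be expressed with a hint: OPTIMIZE FOR UNKNOWN (SQL Server 2008 and later) tells the optimizer to build the plan from average density statistics rather than the sniffed value. A sketch, with a hypothetical procedure name so it doesn't clash with the one above:

    create procedure dbo.spFoo2
      @p varchar(32)
    as

      select *
      from dbo.foo
      where foo.bar = @p or @p is null
      option (optimize for (@p unknown))  -- plan built from density stats, not the sniffed @p

      return @@rowcount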

Further, the recommendation you received:

In many cases, an OPTION (RECOMPILE) hint can be a quick workaround. From a design point of view, you can also consider using separate If clauses or (not recommended) use a dynamic SQL statement.

is hogwash. Option(recompile) means that the statement is recompiled on every execution. While that compilation is happening, compile-time locks are taken out on the dependent objects. Further, nobody else is going to be able to execute the stored procedure until the compilation is completed. This has, shall we say, a negative impact on concurrency and performance. Use of option(recompile) should be a measure of last resort.

Write clean SQL and vet your execution plans using production data, or as close as you can get to it: the execution plan you get is affected by the size and shape/distribution of the data.
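
One low-friction way to do that check, as a sketch: ask the server for the actual plan alongside the results. Here the example query is used with a throwaway local variable standing in for the procedure parameter:

    SET STATISTICS XML ON;   -- returns the actual execution plan with the result set

    DECLARE @someparam varchar(32) = NULL;   -- stand-in for the procedure parameter

    SELECT *
    FROM TabFoo
    WHERE ColBar = @someparam OR @someparam IS NULL;

    SET STATISTICS XML OFF;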

Nicholas Carey
I could be wrong, but I'm pretty sure a table scan will occur no matter what if the column you have in your where clause isn't indexed. Also, you could probably get better performance by reordering your OR clauses so that if @ParA IS NULL is true, it evaluates first and would not require evaluating the value in the column. Something to remember is that the where clause is evaluated for every row that comes back from the from clause. I would not recommend dynamic SQL, and honestly, even under relatively heavy load I'd find it difficult to believe that this form of filter would cause a significant performance hit, since a table scan is required anytime the column isn't indexed.
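
Concretely, giving the optimizer something to seek on is just a matter of indexing the filtered column; the table and column names below are borrowed from the question, and the index name is made up:

    -- Hypothetical index name; TabFoo/ColBar come from the question's example.
    CREATE NONCLUSTERED INDEX IX_TabFoo_ColBar
        ON TabFoo (ColBar);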

CodeMonkey1313
  • The SQL standard allows the optimizer free rein to evaluate things in the where clause in any order it so chooses, so long as the semantics of the parse tree are maintained. `where A = B or B is null` is semantically identical to `where B is null or A = B` and should result in the same execution plan. That being said, I have, on rare occasions, seen situations where reordering the parts of a [usually very complicated] WHERE clause *did* result in different execution plans. – Nicholas Carey Jan 15 '14 at 02:05
We did a Microsoft engagement where they noted that we had a ton of this "Wildcard Pattern Usage", and their suggestion was to convert the query to an IF/ELSE structure...

    IF (@someparam is null) BEGIN
        SELECT *
        FROM TabFoo
    END
    ELSE BEGIN
        SELECT *
        FROM TabFoo
        WHERE ColBar = @someparam
    END

They preferred this approach over recompile (which adds to execution time) or dynamic code (which can't be planned ahead of time, so it's much the same thing, having to figure out the plan every time). I seem to recall that the plan problem can still be an issue even with local variables (plus, you need extra memory for them regardless).

You can see that things get a bit crazy if you write queries with multiple WPU issues, but at least for the smaller ones, MS recommends the IF/ELSE approach.
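
For instance, the two-parameter filter from the question already needs four branches when rewritten this way (a sketch; the table name is borrowed from the first example, the columns from the second):

    IF (@ParA IS NULL AND @ParB IS NULL)
        SELECT * FROM TabFoo
    ELSE IF (@ParA IS NOT NULL AND @ParB IS NULL)
        SELECT * FROM TabFoo WHERE ColA = @ParA
    ELSE IF (@ParA IS NULL AND @ParB IS NOT NULL)
        SELECT * FROM TabFoo WHERE ColB = @ParB
    ELSE
        SELECT * FROM TabFoo WHERE ColA = @ParA AND ColB = @ParB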

In all the examples I saw, NULL was involved, but I can't help thinking that if you had a parameter with a default value, whether declared on the parameter itself or set with an ISNULL(), and used essentially the same pattern, that might also be bad (as long as the default is something an "actual value" would never be, that is).
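
To make that variant concrete, here's a sketch of the pattern being described; the procedure name and the 'ALL' sentinel are made up:

    CREATE PROCEDURE dbo.spFooSentinel          -- hypothetical name
        @someparam varchar(32) = 'ALL'          -- sentinel default, assumed never to be a real ColBar value
    AS
        SELECT *
        FROM TabFoo
        WHERE ColBar = @someparam OR @someparam = 'ALL';

The optimizer faces the same situation as with the NULL version: the second half of the OR can't be resolved until runtime.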