I am struggling to debug performance on a particular query. The query is this:
select count(*)
FROM dbo.user d
INNER JOIN dbo.distinct_first_name dfn ON (
[dbo].jw(dfn.first_name, 'john') > 0.8
AND
(d.first_name = dfn.first_name
OR d.nick_name = dfn.first_name
OR d.middle_name = dfn.first_name)
)
The query runs a Jaro Winkler filter on a distinct first name table (containing approx 15k rows) and then inner joins this against the user table to produce the result set. As defined, this takes around 1 minute to run with approx 500k rows in the user table.
Here's what I know:
1) The Jaro Winkler filter is almost instant (0.1s by itself)
2) If I change the user clause to only include one of the columns (i.e. remove the ORs) it takes only 0.4s
3) If I change this to three queries, and run them back to back, it takes approx 2s
4) If I change the Jaro Winkler filter to 0.99 (so that there's only one result) it makes no substantive difference in the query execution time
5) If I replace the Jaro Winkler filter with an equality operation (dfn.first_name = 'john') total query time is reduced to 4s
(All timings are on a fairly slow virt; real life performance will be better.)
So, for some reason, the combination of the function and the ORs are confusing the query optimizer. The execution plan is not very informative; it says that 90% of the query is spent on:
<RelOp NodeId="63" PhysicalOp="Clustered Index Seek" LogicalOp="Clustered Index Seek" EstimateRows="1.69029" EstimateIO="0.003125" EstimateCPU="0.000158859" AvgRowSize="17" EstimatedTotalSubtreeCost="71.4311" TableCardinality="15958" Parallel="0" EstimateRebinds="448881" EstimateRewinds="0.504024" EstimatedExecutionMode="Row">
<OutputList>
<ColumnReference Database="[mydb]" Schema="[dbo]" Table="[distinct_first_name]" Alias="[dfn]" Column="first_name" />
</OutputList>
<RunTimeInformation>
<RunTimeCountersPerThread Thread="0" ActualRows="857936" ActualEndOfScans="859454" ActualExecutions="859454" />
</RunTimeInformation>
<IndexScan Ordered="1" ScanDirection="FORWARD" ForcedIndex="0" ForceSeek="0" ForceScan="0" NoExpandHint="0" Storage="RowStore">
Splitting the query up is actually an option, since this is in a sproc, and I can probably redesign the schema a little, but I'm stumped as to what's bogging this down. Any ideas?