3

I've got a query that uses several subqueries. It's about 100 lines, so I'll leave it out. The issue is that I have several rows returned as part of one subquery that need to be joined to an integer value from the main query. Like so:

Select 
... columns ... 
from 
... tables ... 
        (
        select 
        ... column ... 
        from 
        ... tables ...  
        INNER JOIN core.Type mt 
                        on m.TypeID = mt.TypeID 
                    where dpt.[DataPointTypeName] = 'TheDataPointType'
                        and m.TypeID in (100008, 100009, 100738, 100739) 
                        and datediff(d, m.MeasureEntered, GETDATE()) <  365 -- only care about measures from past year 
                        and dp.DataPointValue <> ''
                        ) as subMdp                     
         ) as subMeas 
    on (subMeas.DataPointValue NOT LIKE '%[^0-9]%'  
        and subMeas.DataPointValue = cast(vcert.IDNumber as varchar(50))) -- THIS LINE
... more tables etc ...     

The issue is that if I take out the cast(vcert.IDNumber as varchar(50))) it will attempt to compare a value like 'daffodil' to a number like 3245. Even though the datapoint that contains 'daffodil' is an orphan record that should be filtered out by the INNER JOIN 4 lines above it. It works fine if I try to compare a string to a string but blows up if I try to compare a string to an int -- even though I have a clause in there to only look at things that can be converted to integers: NOT LIKE '%[^0-9]%'. If I specifically filter out the record containing 'daffodil' then it's fine. If I move the NOT LIKE line into the subquery it will still fail. It's like the NOT LIKE is evaluated last no matter what I do.

So the real question is why SQL would be evaluating a JOIN clause before evaluating a WHERE clause contained in a subquery. Also how I can force it to only evaluate the JOIN clause if the value being evaluated is convertible to an INT. Also why it would be evaluating a record that will definitely not be present after an INNER JOIN is applied.

I understand that there's a strong element of query optimizer voodoo going on here. On the other hand I'm telling it to do an INNER JOIN and the optimizer is specifically ignoring it. I'd like to know why.

jcollum
  • 43,623
  • 55
  • 191
  • 321
  • Explained here http://stackoverflow.com/questions/5191701/tsql-divide-by-zero-encountered-despite-no-columns-containing-0 – Martin Smith Aug 26 '11 at 21:46
  • Did you not read my link? Another example with a similar issue and shows you to how to get around it with a `case` expression. http://stackoverflow.com/questions/7192524/unable-to-cast-value-as-float/7192951#7192951 It isn't anything specifically to do with joins SQL Server will often order operations so the compute scalar happens before the filter. – Martin Smith Aug 29 '11 at 16:38
  • 1
    @Martin: that did it, I put the NOT LIKE in a case statement in the SELECT in the submeas subquery. I can write it in as an answer but you should get the credit, so write it up and I'll mark it as Answer. – jcollum Aug 29 '11 at 17:14

4 Answers4

2

So the real question is why SQL would be evaluating a JOIN clause before evaluating a WHERE clause contained in a subquery.

Because SQL engines are required to behave as if that's what they do. They're required to act like they build a working table from all of the table constructors in the FROM clause; expressions in the WHERE clause are applied to that working table.

Joe Celko wrote about this many times on Usenet. Here's an old version with more details.

Mike Sherrill 'Cat Recall'
  • 91,602
  • 17
  • 122
  • 185
  • I don't feel this answers my question. There's one sentence in particular that isn't the behavior that I'm seeing: "Start in the FROM clause and build a working table from all of the joins, unions, intersections, and whatever other table constructors are there". I'm not seeing that; instead I'm seeing an evaluation of a where clause happening before an INNER JOIN. My question is how I can force the WHERE clause to only be evaluated on the results of the join. – jcollum Aug 29 '11 at 16:35
2

The problem you are having is discussed in this item of feedback on the connect site.

Whilst logically you might expect the filter to exclude any DataPointValue values that contain any non numeric characters SQL Server appears to be ordering the CAST operation in the execution plan before this filter happens. Hence the error.

Until Denali comes along with its TRY_CONVERT function the way around this is to wrap the usage of the column in a case expression that repeats the same logic as the filter.

Martin Smith
  • 438,706
  • 87
  • 741
  • 845
  • Good lord, SQL optimizer voodoo is killing me. I have a straight join after the subquery `(subMeas.DataPointValue = vcert.IDNumber)`and I get results. I then comment out the line `and dp.DataPointValue <> ''` and it errors on the value '00spr' (which is an orphan record that should be filtered by the JOIN *AND* wouldn't have been filtered by the `<> ''` anyway. WTF. – jcollum Aug 29 '11 at 18:00
  • @jcollum - If you look at the two execution plans you should probably see that the placement of a compute scalar operator changes somewhere. – Martin Smith Aug 29 '11 at 18:18
  • @jcollum, I guess it's not totally a problem of the optimizer, but due to ANSI NULLS. Try ISNULL(dp.DataPointValue,'') <> '' – Guillermo Gutiérrez Jan 22 '13 at 14:13
0

First of all,

 NOT LIKE '%[^0-9]%'

isn`t work well. Example:

DECLARE @Int nvarchar(20)= ' 454 54'
SELECT  CASE WHEN @INT LIKE '%[^0-9]%' THEN 1 ELSE 0 END AS Is_Number
Result: 1

But it is not a number!

To check if it is real int value , you should use ISNUMERIC function. Let`s check this:

DECLARE @Int nvarchar(20)= ' 454 54'
SELECT ISNUMERIC(@int) Is_Int
Result:0

Result is correct.

So, instead of

NOT LIKE '%[^0-9]%'

try to change this to

ISNUMERIC(subMeas.DataPointValue)=0

UPDATE

How check if value is integer? First here:

WHERE ISNUMERIC(str) AND str NOT LIKE '%.%' AND str NOT LIKE '%e%' AND str NOT LIKE '%-%'

Second:

CREATE Function dbo.IsInteger(@Value VarChar(18))
Returns Bit
As 
Begin

  Return IsNull(
     (Select Case When CharIndex('.', @Value) > 0 
                  Then Case When Convert(int, ParseName(@Value, 1)) <> 0
                            Then 0
                            Else 1
                            End
                  Else 1
                  End
      Where IsNumeric(@Value + 'e0') = 1), 0)

End
Community
  • 1
  • 1
Dalex
  • 3,585
  • 19
  • 25
  • `ISNUMERIC` doesn't guarantee that the value can be cast to an `int` and even if it did as `ISNUMERIC(subMeas.DataPointValue + 'e0')` does this wouldn't fix the OP's issue - Which is essentially the same as this one http://stackoverflow.com/questions/5191701/tsql-divide-by-zero-encountered-despite-no-columns-containing-0 – Martin Smith Aug 29 '11 at 08:44
  • And you seem to be missing the point of the original `'%[^0-9]%'` expression. It has a negation to find non numeric characters so the result from your test is entirely expected. You could equally well have used `fish` instead of `454 54` – Martin Smith Aug 29 '11 at 08:51
  • I got the `NOT LIKE` phrase from a blog post that was about ISNUMERIC not working like you expect. ISNUMERIC basically says 'can this be converted to any sort of number'. '454 54' could be converted to a number if you are using spaces for separators. Maybe it isn't very sensible way to display a number but it still work. – jcollum Aug 29 '11 at 16:15
-1

Filter out the non-numeric records in a subquery or CTE

JNappi
  • 1,495
  • 1
  • 14
  • 24