1

This is not really a problem because I solved it. But, I wanted to share something that ate my brain today. So, run this query and check it out:

The query:

select 2 as ID1,0 as ID2 into #TEMP
insert into #TEMP (ID1,ID2)
values (2,1),(2,2),(2,0)

select
  case when min(case when ID1=2 and ID2=0 then 0
                     else 1
                end)=0 then 0
       else sum(log(ID2))
  end
from #TEMP

The fix:

select
  case when min(case when ID1=2 and ID2=0 then 0
                     else 1
                end)=0 then 0
       else sum(log(case when ID1=2 and ID2<>0 then ID2
                         else 1
                    end))
  end
from #TEMP

My query was larger and more difficult to debug, but what do you say about the plan that MSSQL is making and the fact that it gets it wrong with this query? How can it be modified to work except my little fix that I showed before? I am guessing that computing scalars before the query would make things slow if the scalars are not easy to compute and we compute for all values.

Josh Darnell
  • 11,304
  • 9
  • 38
  • 66
Dumitrescu Bogdan
  • 7,127
  • 2
  • 23
  • 31
  • I cant' even run this: *Msg 3623, Level 16, State 1, Line 7 An invalid floating point operation occurred.* What's wrong with the execution plan you get? – Yuck Jan 06 '12 at 16:03
  • `values (2,1),(2,2),(2,0)` - huh...I didn't know there was such a syntax for inserting multiple records. Thanks! – Dr. Wily's Apprentice Jan 06 '12 at 16:04
  • This is the error .. but if you check the code, there should be no error. The problem is that sql calculates all the scalars and after that places them in the case, so it will calculate log(0) when log(0) should never be calculated.(based on the case) – Dumitrescu Bogdan Jan 06 '12 at 16:05

2 Answers2

2

SQL Server does not perform short circuit evaluation (i.e. should not be relied upon). It's a fairly well known problem.

Ref:

Sunny Patel
  • 7,830
  • 2
  • 31
  • 46
Mitch Wheat
  • 295,962
  • 43
  • 465
  • 541
  • Thanks, I knew about sql not performing short circuit evaluations, but this is not the case. But I didn't know about the CASE which I thought that it will be evaluated correct. Then again, if you look on the fix, if I put the condition again in the case then it will not fail. – Dumitrescu Bogdan Jan 06 '12 at 16:11
1

EDIT: I misunderstood the question with my original answer.

I supposed you could add a where clause, as shown below. Side note: this query might benefit from an index on (ID1, ID2).

select
  sum(log(convert(float, ID2)))
from #TEMP
where
    -- exclude all records with ID1 = 2 if there are any records with ID1 = 2 and ID2 = 0
    not exists (
        select 1
        from #TEMP NoZeros
        where
            NoZeros.ID1 = #TEMP.ID1
            and NoZeros.ID2 = 0
    )

Update: Just in case performance is a concern, I got fairly comparable performance from this query after adding the following indexes:

create index ix_TEMP_ID1_ID2 on #TEMP (ID1, ID2)
create index ix_TEMP_ID2 on #TEMP (ID2) include (ID1)

Original Answer

How about modifying your sum function as shown below?

sum(case ID2 when 0 then null else log(convert(float, ID2)) end)
Dr. Wily's Apprentice
  • 10,212
  • 1
  • 25
  • 27