1

I'm writing a stored procedure with quite a lot of expensive work to do that may or may not take a filter parameter. Doing the filtering is itself quite expensive, and the table being filtered is large. I just tried to change the inner filtering function so throw an error if called with invalid parameters, as a warning to developers not to use it that way.

BUT - If I call my outer test function with NULL, it works as I'd expect, not calling the inner function and not throwing the error. If I call my outer test function with a variable with the VALUE of NULL, then it calls the filter function with a null parameter, and throws the error, even thought the code only says to call the function when the value is not null.

What's going on here?

Much simplified example:

IF  EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].[MyTable]') AND type in (N'U')) DROP TABLE MyTable 
GO

CREATE TABLE MyTable (Pk int, Field int)
GO

INSERT INTO MyTable VALUES (1, 1)
INSERT INTO MyTable VALUES (2, 4)
INSERT INTO MyTable VALUES (3, 9)
INSERT INTO MyTable VALUES (4, 16)
GO

IF  EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].[FilterRows]') AND type in (N'FN', N'IF', N'TF', N'FS', N'FT')) DROP FUNCTION FilterRows
GO
CREATE FUNCTION FilterRows(@searchParameter int)
RETURNS @Pks TABLE 
    (           
        Pk int
    )
AS 
BEGIN
    IF (@searchParameter IS null)
    BEGIN
        -- This is bad news. We don't want to be here with a null search, as the only thing we can do is return every row in the whole table
        -- RAISERROR ('Avoid calling FilterRows with no search parameter', 16, 1)       
        -- we can't raise errors in functions!
        -- Make it divide by zero instead then
        INSERT INTO @Pks SELECT Pk FROM MyTable WHERE 1/0 = 1
    END
    ELSE
    BEGIN
        INSERT INTO @Pks SELECT Pk FROM MyTable WHERE Field > @searchParameter
    END
    RETURN
END
GO


IF  EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].[OuterFunction]') AND type in (N'FN', N'IF', N'TF', N'FS', N'FT')) DROP FUNCTION OuterFunction
GO
CREATE FUNCTION OuterFunction(@searchParameter int)
RETURNS TABLE AS
RETURN 
SELECT * 
FROM 
MyTable
WHERE
(@SearchParameter IS NULL) OR (@searchParameter IS NOT NULL AND Pk IN (SELECT Pk FROM dbo.FilterRows(@searchParameter)))
GO

SELECT * FROM dbo.OuterFunction(2) -- Returns filtered values
SELECT * FROM dbo.OuterFunction(null) -- returns everything, doesn't call FilterRows
DECLARE @x int = null
SELECT * FROM dbo.OuterFunction(@x) -- WTF! Throws error!
Ben Curthoys
  • 741
  • 7
  • 20
  • "Optional filters" are a design mistake because the optimizer will cache the execution plan generated for the *first* call. Instead of short-circuiting you'll end up with worse performance. The need for such things has gone away with ORMs and LINQ which generate queries with *only* the necessary filters. Performance is exactly the same whether you execute an ad-hoc query or an equivalent stored procedure – Panagiotis Kanavos Sep 08 '16 at 13:29
  • PS filtering is the opposite of expensive - it *reduces* the results. If it isn't, there is a problem with the query. More conditions result in *faster* executions if you have the proper indexes. "Short-circuit" attempts results in *slower* execution because they have to process everything to find what to ... electrocute – Panagiotis Kanavos Sep 08 '16 at 13:32
  • I have deliberately simplified the situation in order make a concise question. I have good reasons for doing what I'm doing and I know how the query analyzer works. The question I am asking is about unexpected behavior from a UDF. I am not asking to be patronizingly told that what I'm trying to achieve is a wrong thing to be trying to achieve. – Ben Curthoys Sep 08 '16 at 13:36
  • Then you should post that there is an error in the question text itself including the full error, not in the last line of the second page of the code. Trying to use "optional" filter parameters is a common mistake, one that I also did before realizing the consequences. *Where* does the error occur? – Panagiotis Kanavos Sep 08 '16 at 13:46
  • Ok. What I really have is a table representing folders, so the main table has a folder id and a parent folder id. I'm writing a query for a report, and the report can have a target folder, in which case it should run for that folder and include all its children and contents. – Ben Curthoys Sep 08 '16 at 13:49
  • If it was as simple as that, I could write a folder-children function as an inline function using a recursive CTE, and that would be fine. But, there are also folder shortcuts. A folder X can contain a shortcut to folder Y elsewhere in the tree, and the contents of Y need to be returned in the report too. I've tried and failed to write a recursive CTE that allows for shortcuts AND doesn't crash when a user sets up an infinite loop. – Ben Curthoys Sep 08 '16 at 13:49
  • So, I have a multistatement function that given a folder id, returns the folder ids of all children, including shortcuts and the children of shortcuts. This works fine on any node except for the root node of the tree, where even if I skip the looking up children processing and just return all folder ids in the table, it would have to return a table variable with too many rows in it to perform well. – Ben Curthoys Sep 08 '16 at 13:49

2 Answers2

0

The difference when a value null is passed than when constant null is passed is the same difference between using (is Null) and (= null)

@var = null -- considered as false

@var is null -- considered as unknown

for more details : SQL is null and = null

so if you want to make behavior of both (calling constant null & pass Null value) is the same, use the following tricky although I don't prefer this one.

Alter FilterRows function to be

IF (@searchParameter = null)
--IF (@searchParameter is null)

Note: sorry for typing this answer here, it is supposed to be comment instead of answer, the rule is "You must have 50 reputation to comment" and I have only 22 :(

Community
  • 1
  • 1
ahmed abdelqader
  • 3,409
  • 17
  • 36
  • I don't think that's right. DECLARE @x int = NULL; IF (@x IS NULL) PRINT 'X is null' prints X is null. Because testing @x IS NULL returns true. Testing @x = NULL returns null. IF (@searchParameter = null) will never return true, AFAIK. If you had a code snippet showing otherwise I'd be interseted! – Ben Curthoys Sep 09 '16 at 13:24
  • I tested my answer before posting it via using your example above (that located in your question), and also your code in the comment is correct also , something is wired ! – ahmed abdelqader Sep 09 '16 at 15:12
0

I think what's going on is that in

SELECT * FROM MyTable WHERE (@SearchParameter IS NULL) OR 
(@searchParameter IS NOT NULL AND Pk IN (SELECT Pk FROM dbo.FilterRows(@searchParameter)))

The query analyzer can see that the subquery

(SELECT Pk FROM dbo.FilterRows(@searchParameter))

does not depend on any values from MyTable. As it's constant for all rows, it runs that subquery first, in order to join MyTable to the results. So it executes it before evaluating the WHERE clause where it tests whether @searchParameter IS NULL or not.

When @searchParameter is just "NULL" and not a variable with value NULL, then the analyzer can short-circuit the whole where clause in the execution plan and so knows not to pre-calculate the subquery.

Or, something like that.

Ben Curthoys
  • 741
  • 7
  • 20