3

If I have a sql statement like this:

select * 
from tableA a
inner join tableB b on dbo.fn_something(a.ColX) = b.ColY

if you assume there are 5 rows in tableA with the same value for ColX will dbo.fn_something() be called with that value 5 times or just one time?

Clearly this is a trivial example, but I'm interested for the purposes of thinking about performance in a more complex scenario.

UPDATE Thanks @DStanley, following from your answer I investigated further. Using SQL Profiler with the SP:StmtStarting event on the SQL below illustrates what happens. i.e. as you said: the function will be called once for each row in the join.

This has an extra join from the original question.

create table tableA 
( id int ) 

create table tableB
( id_a int not null 
, id_c int not null 
) 

create table tableC
( id int ) 

go 
create function dbo.fn_something( @id int ) 
returns int 
as 
begin 
    return @id
end 
go 

-- add test data

-- 5 rows:
insert into tableA (id) values (1), (2), (3), (4), (5) 
-- 5 rows:
insert into tableC (id) values (101), (102), (103), (104), (105) 
-- 25 rows:
insert into tableB (id_a, id_c) select a.id, c.id from tableA a, tableC c

go 

-- here dbo.fn_something() is called 25 times: 
select * 
from tableA a
inner join tableB b on a.id = b.id_a
inner join tableC c on c.id = dbo.fn_something(b.id_c)

-- here dbo.fn_something() is called just 5 times, 
-- as the 'b.id_c < 102' happens to be applied first.
-- That's likely to depend on whether SQL thinks it's 
-- faster to evaluate the '<' or the function. 
select * 
from tableA a
inner join tableB b on a.id = b.id_a
inner join tableC c on c.id = dbo.fn_something(b.id_c) and b.id_c < 102

go

drop table tableA ; 
drop table tableB; 
drop table tableC; 
drop function dbo.fn_something;
go 
Rory
  • 40,559
  • 52
  • 175
  • 261

2 Answers2

6

It will be called for each row in a. I do not know of any optimization that would call the function just for unique inputs. If performance is an issue you could create a temp table with distinct input values and use thoce results in your join, but I would only do that it it was an issue - don't assume it's a problem and clutter your query unnecessarily.

D Stanley
  • 149,601
  • 11
  • 178
  • 240
  • Doesn't the answer depend on a little more detail than this? IE, inline TVF vs Multi-Statement TVF? – Brad D Mar 23 '15 at 17:52
  • With the help of @sqlacid here http://stackoverflow.com/a/352615/8479 I used SQL Profiler's SP:StmtStarting to verify that `dbo.fn_something` is called for every row in the join. – Rory Mar 23 '15 at 18:07
  • @BradD, I was only interested in scalar functions so updated title accordingly. – Rory Mar 23 '15 at 18:18
  • @BradD Based on the usage I was assuming a scalar function. – D Stanley Mar 23 '15 at 18:37
  • Also if you are writing scalar functions to be part of your joins, I would seriously look at fixing the database design. You should never have a database that can't directly join on the fields you intend to join on (and they should usually all be indexed). It is far better to have a persisted calculated field if you must, so the calculation is only done when the data is entered or changed, not everytime you call it in a join. – HLGEM Mar 23 '15 at 19:06
1

If you declare your function as schema bound, it can be run one for each unique case. This requires that the function be deterministic and always has the same output for a given input.

CREATE FUNCTION dbo.fn_something (@id INT)
RETURNS INT
WITH SCHEMABINDING
AS
BEGIN
    RETURN @id
END
GO
  • While SCHEMABINDING is indeed required, are you sure that the effect is to only run once per each unique input value? Have you tested it? My testing shows that the T-SQL UDF (the one you are showing here) runs once per row regardless of the value and regardless of whether or not `WITH SCHEMABINDING` was used. – Solomon Rutzky Mar 24 '15 at 04:57