10

In a SQL SELECT statement, I'd like to execute a function that is deterministic for the scope of that SELECT statement (or transaction would be ok, too):

select t.x, t.y, my_function(t.x) from t

Many values of t.x are the same so Oracle could omit calling the same function again and again, to speed things up. But if I label the function as DETERMINISTIC, the results may be cached between several executions of this query. The reason why I can't use DETERMINISTIC is because my_function uses a configuration parameter that is changed from time to time.

Is there any other keyword I could use? Are there any catches that I should be aware of (memory issues, concurrency, etc)? Or maybe any other tricks, such as analytic functions to call the function only once per t.x value (without major performance impact)?

Lukas Eder
  • 211,314
  • 129
  • 689
  • 1,509

4 Answers4

11

If you do this:

select t.x, t.y, (select my_function(t.x) from dual)
from t

then Oracle can use subquery caching to achieve reduced function calls.

Tony Andrews
  • 129,880
  • 21
  • 220
  • 259
  • 1
    Very nice, according to Tom's explanations, this seems to be even more efficient than marking the function as `DETERMINISTIC` – Lukas Eder Sep 01 '11 at 13:34
  • Interesting, too: `RESULT_CACHE` probably does about what I wanted, but the scalar subquery caching seems much more intuitive and straight-forward, and even faster! – Lukas Eder Sep 01 '11 at 13:41
  • 3
    Good answer, but I find it a little disturbing actually. Something that seems like a purely syntactical change can alter the results of the query (given that the function is non-deterministic). – Dave Costa Sep 01 '11 at 14:02
  • @Dave Costa, yes these things are undergoing constant change in Oracle. See also [this question](http://stackoverflow.com/questions/6856743/semantics-of-oracle-stored-procedures-functions-in-a-transactional-context) I've recently asked. Lots of implicit knowledge is required at the interface of `SQL` and `PL/SQL` – Lukas Eder Sep 01 '11 at 14:07
3

This is not an answer to your question, but can be a solution for you. This configuration parameter that you've mentioned, can't be added as a parameter to function? In this case, my_function(t.x, val1) is a different thing vs my_function(t.x, val2).

Florin Ghita
  • 17,525
  • 6
  • 57
  • 76
2

A possible simplistic workaround would be to create a second, DETERMINISTIC function that calls the first one; but have the second function take an additional, meaningless parameter, for which you provide a different literal value in each query that uses the function.

Dave Costa
  • 47,262
  • 8
  • 56
  • 72
  • 1
    Awesome idea! That trick will work! So I'll call the `DETERMINISTIC` `my_function_2(t.x, sysdate)` – Lukas Eder Sep 01 '11 at 13:14
  • Doesn't sound like this would increase the maintainability of the project. – Erich Kitzmueller Sep 01 '11 at 13:22
  • @ammoQ: Yes you're right in principle. But this is about a very remote offline job, not very central function. But it generates way too much load on the database for what it really does. I can live with a little tweak... – Lukas Eder Sep 01 '11 at 13:31
  • @Dave Costa, I've switched accepted answers. There's a very interesting article on scalar subquery caching! – Lukas Eder Sep 01 '11 at 13:46
2

Another method is to put the function in a package and set the result as a global variable. Then when you call the function check whether the input variables are the same as before and quickly return the global variable if so:

SQL> create or replace package temp is
  2
  3    function blah ( PIndex integer ) return integer;
  4
  5  end temp;
  6  /

Package created.

SQL>
SQL> create or replace package body temp is
  2
  3    GResult integer := 0;
  4    GIndex integer;
  5
  6    function blah ( PIndex integer ) return integer is
  7
  8      begin
  9
 10        if Gindex = Pindex then
 11          return Gresult;
 12        else
 13          GIndex := Pindex;
 14          GResult := Pindex;
 15        end if;
 16
 17        return Gresult;
 18
 19      end blah;
 20
 21  end temp;
 22  /

Package body created.

SQL>
SQL> select temp.blah(1) from dual;

TEMP.BLAH(1)
------------
           1

SQL>
SQL> select temp.blah(1) from dual;

TEMP.BLAH(1)
------------
           1

SQL>
SQL> select temp.blah(2) from dual;

TEMP.BLAH(2)
------------
           2

SQL>
Ben
  • 51,770
  • 36
  • 127
  • 149
  • Unfortunately, the global variable may live longer than the time between two changes of the configuration value. So this may produce wrong results... – Lukas Eder Sep 01 '11 at 13:36
  • @Lukas, you know your own data best :-). I haven't seen it ever produce incorrect results for me but we don't have anything where the execution time is important to the result. – Ben Sep 01 '11 at 13:40
  • It's still a nice idea, though. But check out the accepted answer. Scalar subquery caching is the optimal solution – Lukas Eder Sep 01 '11 at 13:48
  • I'm not sure I would trust Tom in this case. I use Globals for caching dynamic array results. So the first call would partially fill the array but then having all the information it needed return the correct value. The third call might find that it needs some more information to finish it's calculation. It'll add some more information to the array - normally calculation's done in memory - and then return etc. – Ben Sep 01 '11 at 13:54
  • I think Tom's example is really about "semi-deterministic" functions, that are not formally deterministic but don't need to be executed for every result set row - just as in my question... – Lukas Eder Sep 01 '11 at 13:59