4

I have a function to get the balance for a customer.

CREATE OR REPLACE FUNCTION default1.get_balance (par_customer_id DECIMAL(31, 0))
  RETURNS DECIMAL(31,15)
  LANGUAGE SQL
  DETERMINISTIC
  NO EXTERNAL ACTION
  READS SQL DATA
BEGIN

  DECLARE var_balance DECIMAL(31,15); 

  SELECT SUM(amount)
  INTO var_balance
  FROM default1.accounting accounting 
  WHERE accounting.customer_id  = par_customer_id
    AND (YEAR(accounting.accounting_date) >= YEAR(SYSDATE)-3 or accounting.accounting_date IS NULL)
    AND paid_date IS NULL
    AND accounting_type_id <> 2
    AND NOT EXISTS (
        SELECT 1 
        FROM default1.accounting_detail detail 
        WHERE accounting.id = detail.accounting_id 
        AND detail.paid_date IS NOT NULL);

  RETURN var_balance;

END

The performance for getting the balance of one customer is good, but using the function in a query to get the balance for multiple customers at once it gets really slow.

SELECT default1.get_balance(customer.id), customer.*
FROM default1.customer customer
WHERE customer.id < 1000

This query takes over 2 minutes to execute.

When I replace the function in the query with a subselect it is much faster.

SELECT 
  (SELECT SUM(amount)
    FROM default1.accounting accounting
    WHERE accounting.customer_id = customer.id
    AND (YEAR(accounting.accounting_date) >= YEAR(SYSDATE)-3 or accounting.accounting_date IS NULL)
    AND paid_date IS NULL
    AND accounting_type_id <> 2
    AND NOT EXISTS (
        SELECT 1 
        FROM default1.accounting_detail detail 
        WHERE accounting.id = detail.accounting_id 
        AND detail.paid_date IS NOT NULL)),
  customer.*
FROM 
    default1.customer customer
WHERE customer.id < 1000

This query takes about 8 seconds.

I did execute the both queries multiple times in different orders without any significant change in the runtime. So I don't think that it is a caching issue.

Why does the query with the function takes about 15 times longer than the query with the subselect?
Is there anything I can change in the function to make it faster?

raznagul
  • 355
  • 2
  • 19

1 Answers1

2

I'm assuming DB2 for LUW.

Performance of your function can suffer because it uses a compiled compound statement as its body (BEGIN ... END). Try using an inlined compound statement: BEGIN ATOMIC ... END. Even better, you can probably simply use the RETURN statement only:

CREATE OR REPLACE FUNCTION default1.get_balance (par_customer_id DECIMAL(31, 0))
  RETURNS DECIMAL(31,15)
  LANGUAGE SQL
  NOT DETERMINISTIC
  NO EXTERNAL ACTION
  READS SQL DATA
RETURN SELECT SUM(amount)
  INTO var_balance
  FROM default1.accounting accounting 
  WHERE accounting.customer_id  = par_customer_id
    AND (YEAR(accounting.accounting_date) >= YEAR(SYSDATE)-3 or accounting.accounting_date IS NULL)
    AND paid_date IS NULL
    AND accounting_type_id <> 2
    AND NOT EXISTS (
        SELECT 1 
        FROM default1.accounting_detail detail 
        WHERE accounting.id = detail.accounting_id 
        AND detail.paid_date IS NOT NULL);

When using a compiled compound statement, each invocation of the function causes a context switch from the SQL data access engine to the PSM execution engine and back, while the inlined statement becomes a part of the query plan itself.

Note that you shouldn't declare this function as DETERMINISTIC, because it's not; mis-declaring a non-deterministic function could cause unexpected results.

mustaccio
  • 18,234
  • 16
  • 48
  • 57