Using dask delayed function from within postgresql plpython with "plpy.execute"

Question

The example below uses a Dask delayed function from within PostgreSQL PL/Python (plpython3u), calling "plpy.execute" inside the delayed function to query the database.

It returns an error:

ERROR: spiexceptions.StatementTooComplex: stack depth limit exceeded

Any idea what I'm doing wrong? My guess is that it has something to do with the asynchronous nature of delayed functions and plpy.execute not coping with that.

Versions:

PostgreSQL 15
Python 3.8 (the version embedded in PostgreSQL)

Example:

DO
LANGUAGE plpython3u
$$

    # https://docs.dask.org/en/stable/dataframe-sql.html#delayed-functions
    from dask import delayed
    
    @delayed
    def do_it():
        rv = plpy.execute("select 2 as a") # << max stack depth limit
        return 0

    plpy.info(do_it().compute())

$$;

Traceback:

ERROR:  spiexceptions.StatementTooComplex: stack depth limit exceeded
HINT:  Increase the configuration parameter "max_stack_depth" (currently 7168kB), after ensuring the platform's stack depth limit is adequate.
CONTEXT:  Traceback (most recent call last):
  PL/Python anonymous code block, line 10, in <module>
    plpy.info(do_it().compute())
  PL/Python anonymous code block, line 313, in compute
  PL/Python anonymous code block, line 598, in compute
  PL/Python anonymous code block, line 88, in get
  PL/Python anonymous code block, line 510, in get_async
  PL/Python anonymous code block, line 318, in reraise
  PL/Python anonymous code block, line 223, in execute_task
  PL/Python anonymous code block, line 118, in _execute_task
  PL/Python anonymous code block, line 7, in do_it
    rv = plpy.execute("select 2 as a") # << max stack depth limit
PL/Python anonymous code block
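
A possible explanation, offered as an assumption rather than something confirmed in this thread: Dask's default scheduler for delayed objects is thread-based, so do_it would run on a worker thread rather than on the thread the PostgreSQL backend started on. PostgreSQL backends are single-threaded and measure stack depth relative to the main thread's stack, so a plpy.execute call made from another thread can plausibly trip the "stack depth limit exceeded" check even though nothing is recursing. The sketch below uses only the Python standard library, with a ThreadPoolExecutor standing in for a thread-based scheduler (the helper name running_thread is made up for illustration), to show the difference between calling a function directly and handing it to a pool.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def running_thread():
    """Return the ident of the thread that executes this function."""
    return threading.get_ident()


caller_thread = threading.get_ident()

# Direct call: the function runs on the calling thread.
direct_thread = running_thread()

# Pool call: the function runs on a worker thread, much as it would
# under a thread-based task scheduler.
with ThreadPoolExecutor(max_workers=1) as pool:
    pooled_thread = pool.submit(running_thread).result()

print("direct call ran on caller thread:", direct_thread == caller_thread)
print("pooled call ran on caller thread:", pooled_thread == caller_thread)
```

If this is indeed the cause, the first thing to try would be asking Dask to run the graph in the calling thread, for example do_it().compute(scheduler="synchronous"). That gives up parallelism, and it has not been tested against the setup described here.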

Updates:

added traceback
made more minimal

Hi @Shadi, I'm just wondering why you would want to use Dask Delayed inside Postgres' embedded Python? Code there is supposed to be minimal, right? - Guillaume EB, Feb 13 '23 at 15:37
1) To perform Dask operations on Postgres data without leaving the server. 2) Yes. - Shadi, Feb 13 '23 at 22:34
What I mean is, do you want to parallelize the code inside Postgres' Python? Why not just write plain sequential Python code? - Guillaume EB, Feb 15 '23 at 20:15
Yes, to save the step of moving data out of the Postgres server. - Shadi, Feb 15 '23 at 20:45
Well, I understand that. My point is: do you really need Dask for a computation embedded in Postgres? This doesn't sound appropriate. - Guillaume EB, Feb 23 '23 at 10:46
