The example below demonstrates using dask delayed functions (ref) from within PostgreSQL PL/Python while using "plpy.execute" (ref) to query the database.

It returns an error:

ERROR: spiexceptions.StatementTooComplex: stack depth limit exceeded

Any idea what I'm doing wrong? I'm guessing it has something to do with the delayed function's async nature and plpy.execute not liking that.

Versions:

  • postgresql 15
  • postgres's embedded python version 3.8

Example:

DO
LANGUAGE plpython3u
$$

    # https://docs.dask.org/en/stable/dataframe-sql.html#delayed-functions
    from dask import delayed
    
    @delayed
    def do_it():
        rv = plpy.execute("select 2 as a") # << max stack depth limit
        return 0

    plpy.info(do_it().compute())

$$;

Traceback:

ERROR:  spiexceptions.StatementTooComplex: stack depth limit exceeded
HINT:  Increase the configuration parameter "max_stack_depth" (currently 7168kB), after ensuring the platform's stack depth limit is adequate.
CONTEXT:  Traceback (most recent call last):
  PL/Python anonymous code block, line 10, in <module>
    plpy.info(do_it().compute())
  PL/Python anonymous code block, line 313, in compute
  PL/Python anonymous code block, line 598, in compute
  PL/Python anonymous code block, line 88, in get
  PL/Python anonymous code block, line 510, in get_async
  PL/Python anonymous code block, line 318, in reraise
  PL/Python anonymous code block, line 223, in execute_task
  PL/Python anonymous code block, line 118, in _execute_task
  PL/Python anonymous code block, line 7, in do_it
    rv = plpy.execute("select 2 as a") # << max stack depth limit
PL/Python anonymous code block
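
The `get_async` frame in the traceback suggests the task went through one of dask's local schedulers, and as far as I know the default for delayed objects is a thread pool. One possible reading, stated as an assumption rather than a diagnosis, is that `do_it` runs on a worker thread while `plpy.execute` expects to be called from the backend's own thread. The sketch below uses only the standard library, with `ThreadPoolExecutor` as a stand-in for the threaded scheduler (the helper name `current_thread_id` is made up for illustration), to show that pooled work lands on a different thread than the caller:

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def current_thread_id():
    # Report the identifier of whichever thread executes this call.
    return threading.get_ident()


# Thread that is running this script (the "caller").
caller_id = threading.get_ident()

# Direct call: executes on the caller's thread.
direct_id = current_thread_id()

# Pooled call: stand-in (assumption) for a thread-pool based scheduler.
with ThreadPoolExecutor(max_workers=1) as pool:
    pooled_id = pool.submit(current_thread_id).result()

same_thread_direct = direct_id == caller_id
same_thread_pooled = pooled_id == caller_id

print("direct call on caller thread:", same_thread_direct)
print("pooled call on caller thread:", same_thread_pooled)
```

If the same thing happens inside PL/Python, passing `scheduler="synchronous"` to `compute()` (a documented dask option) should keep `do_it` on the calling thread. Whether that clears the stack depth error here is untested.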

Updates:

  • added traceback
  • made more minimal
Shadi
  • Hi @Shadi, I'm just wondering why you would want to use Dask Delayed in Postgres' embedded Python? Code there is supposed to be minimal, right? – Guillaume EB Feb 13 '23 at 15:37
  • 1) to perform dask operations on postgres data without leaving the server, 2) yes – Shadi Feb 13 '23 at 22:34
  • What I mean is, will you want to parallelize the code inside Postgres Python? Why don't you just write plain sequential Python code? – Guillaume EB Feb 15 '23 at 20:15
  • yes, to save the step of moving data out of the postgres server – Shadi Feb 15 '23 at 20:45
  • Well, I understand that. My point is: do you really need Dask for a computation embedded in Postgres? This doesn't sound appropriate. – Guillaume EB Feb 23 '23 at 10:46
