-1

I am curios about capabilities of relational DBMS to avoid extra copying while selecting from columns of fixed size types (e.g. long, timestamps, double).

These arrays are used as-is by libraries such as numpy.

create table t(d float8, s text, ts timestamp);
from multiprocessing.managers import SharedMemoryManager
from multiprocessing.shared_memory import SharedMemory
from typing import Tuple
import numpy as np
import psycopg2

conn = psycopg2.connect("dbname=test user=postgres")
cur = conn.cursor()

with SharedMemoryManager() as smm:
  shared_mem = smm.SharedMemory(size = 1000 * sizeof(double))
  arr = create_np_array_from_shared_mem(shared_mem, dtype('float64'), (1000,))
      
  cur.execute(
    "select d from t where ts between now() - interval '1 month' and now() limit 1000"
  )

  cur.fetchAsIsIntoMemory(shared_mem.buf)

  arr = np.frombuffer(shared_mem.buf, dtype=dtype('float64'))
  arr = arr.reshape((1000,))
  print("SUCCESS!!!")

I expect fetchAsIsIntoMemory makes the driver to use memory from the pointer as a network buffer.

Daniil Iaitskov
  • 5,525
  • 8
  • 39
  • 49
  • Your question contains tags for `postgresql` and `google-cloud-bigquery`, are you using bigquery for the above code? – kiran mathew Aug 04 '23 at 13:46
  • in my work I deal with these systems, but I am curious about any attempt regardless of specific software. – Daniil Iaitskov Aug 04 '23 at 18:14
  • Hi @Daniil Iaitskov,If you are using BigQuery, I recommend you to try using BigQuery [official client libraries](https://cloud.google.com/bigquery/docs/reference/libraries) for a faster, more secure, and efficient experience. – kiran mathew Aug 09 '23 at 12:27
  • BigQuery encodes response as JSON only. For getting a response in Avro or Parket format it should be stored into blob store! – Daniil Iaitskov Aug 09 '23 at 18:34
  • Could you take a look at this method [fetchmany](https://stackoverflow.com/questions/5189997/) if you are concerned about performance? cursor.fetchmany(size) returns the number of rows specified by the size argument. When called repeatedly, this method fetches the next set of rows of a query result and returns a list of tuples. – kiran mathew Aug 14 '23 at 13:40

0 Answers0