
I have a dataframe that holds datetime values and arrays, and possibly other datatypes in the future. I want to write it to PostgreSQL with to_sql, where datetime columns become (timestamp without time zone) and array columns become (bytea) types, but I have no idea what to put for the dtype argument.

Is there a way to build the dtype argument dynamically, based on each dataframe column's datatype?

How the table looks:

CREATE TABLE IF NOT EXISTS data2_mcw_tmp (
      time timestamp without time zone,
      u bytea,
      v bytea,
      w bytea,
      spd bytea,
      dir bytea,
      temp bytea
);

My code so far (after help from user rftr):

import datetime
import pickle

import numpy as np
from sqlalchemy.dialects.postgresql import BYTEA, TIMESTAMP

dtypedict = {}
Data2_mcw_conv = Data2_mcw.copy()

for row in Data2_mcw_conv.index:
    for col in Data2_mcw_conv.columns:

        value = Data2_mcw_conv[col][row]

        try:
            if type(Data2_mcw_conv[col].iloc[0]).__module__ == np.__name__:

                dtypedict.update({col: BYTEA})

                value = Data2_mcw_conv[col].loc[row]

                print('before: ')
                print(value.flags)
                print('---------------------')

                value = value.copy(order='C')

                print('after: ')
                print(value.flags)
                print('=====================')

                value = pickle.dumps(value)

        except:
            if isinstance(Data2_mcw_conv[col].iloc[0], datetime.date):
                dtypedict.update({col: TIMESTAMP})

        Data2_mcw_conv[col][row] = value

Data2_mcw_conv.to_sql(name='data2_mcw_tmp', con=conn,
                      if_exists='replace',
                      dtype=dtypedict)

However, I get this error:

Traceback (most recent call last):
  File "C:\Users\myname\Desktop\database\pickletopdb2.py", line 145, in <module>
    postgres_conv()
  File "C:\Users\myname\Desktop\database\pickletopdb2.py", line 124, in postgres_conv
    Data2_mcw_conv.to_sql(name='data2_mcw_tmp',con=conn,
  File "C:\Python38\lib\site-packages\pandas\core\generic.py", line 2778, in to_sql
    sql.to_sql(
  File "C:\Python38\lib\site-packages\pandas\io\sql.py", line 590, in to_sql
    pandas_sql.to_sql(
  File "C:\Python38\lib\site-packages\pandas\io\sql.py", line 1397, in to_sql
    table.insert(chunksize, method=method)
  File "C:\Python38\lib\site-packages\pandas\io\sql.py", line 831, in insert
    exec_insert(conn, keys, chunk_iter)
  File "C:\Python38\lib\site-packages\pandas\io\sql.py", line 748, in _execute_insert
    conn.execute(self.table.insert(), data)
  File "C:\Python38\lib\site-packages\sqlalchemy\engine\base.py", line 1286, in execute
    return meth(self, multiparams, params, _EMPTY_EXECUTION_OPTS)
  File "C:\Python38\lib\site-packages\sqlalchemy\sql\elements.py", line 325, in _execute_on_connection
    return connection._execute_clauseelement(
  File "C:\Python38\lib\site-packages\sqlalchemy\engine\base.py", line 1478, in _execute_clauseelement
    ret = self._execute_context(
  File "C:\Python38\lib\site-packages\sqlalchemy\engine\base.py", line 1842, in _execute_context
    self._handle_dbapi_exception(
  File "C:\Python38\lib\site-packages\sqlalchemy\engine\base.py", line 2027, in _handle_dbapi_exception
    util.raise_(exc_info[1], with_traceback=exc_info[2])
  File "C:\Python38\lib\site-packages\sqlalchemy\util\compat.py", line 207, in raise_
    raise exception
  File "C:\Python38\lib\site-packages\sqlalchemy\engine\base.py", line 1779, in _execute_context
    self.dialect.do_executemany(
  File "C:\Python38\lib\site-packages\sqlalchemy\dialects\postgresql\psycopg2.py", line 951, in do_executemany
    context._psycopg2_fetched_rows = xtras.execute_values(
  File "C:\Python38\lib\site-packages\psycopg2\extras.py", line 1267, in execute_values
    parts.append(cur.mogrify(template, args))
ValueError: ndarray is not C-contiguous

value.flags output before/after value = value.copy(order='C'):

before: 
  C_CONTIGUOUS : False
  F_CONTIGUOUS : True
  OWNDATA : False
  WRITEABLE : True
  ALIGNED : True
  WRITEBACKIFCOPY : False
  UPDATEIFCOPY : False

---------------------
after: 
  C_CONTIGUOUS : True
  F_CONTIGUOUS : False
  OWNDATA : True
  WRITEABLE : True
  ALIGNED : True
  WRITEBACKIFCOPY : False
  UPDATEIFCOPY : False

=====================
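For reference, the flag transitions above can be reproduced with plain numpy; the array shape below is arbitrary, chosen only so that C and Fortran layouts actually differ:

```python
import numpy as np

# Fortran-ordered (column-major) 2-D array: F_CONTIGUOUS, not C_CONTIGUOUS
a = np.asfortranarray(np.ones((2, 3)))
assert a.flags["F_CONTIGUOUS"] and not a.flags["C_CONTIGUOUS"]

# Row-major copy: now C_CONTIGUOUS, no longer F_CONTIGUOUS
b = a.copy(order="C")
assert b.flags["C_CONTIGUOUS"] and not b.flags["F_CONTIGUOUS"]
```

Note that 1-D arrays are both C- and F-contiguous at once, so the distinction only shows up for arrays with two or more non-trivial dimensions.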

Why is this error occurring, and how can I solve it?

xyiong

1 Answer


Follow the second code snippet of this answer and customize the dtypes to their PostgreSQL equivalents from here. So in your case, e.g.:

from sqlalchemy.dialects.postgresql import BYTEA, TIMESTAMP

def sqlcol(dfparam):
    # ...
    if "datetime" in str(j):
        dtypedict.update({i: TIMESTAMP})
    if "object" in str(j):  # depending on what your other columns' datatypes are
        dtypedict.update({i: BYTEA})
    # ...

Notes:

  • According to the docs, TIMESTAMP is without timezone by default.
  • Your bytea columns are usually represented with dtype object in pandas. You should take this into consideration if you add further data in the future.
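A possible completion of the sketch above, assuming the dataframe's only non-datetime columns hold numpy arrays (the dataframe and its column names are illustrative):

```python
import numpy as np
import pandas as pd
from sqlalchemy.dialects.postgresql import BYTEA, TIMESTAMP

def sqlcol(dfparam):
    """Map each pandas column dtype to a PostgreSQL type for to_sql."""
    dtypedict = {}
    for col, dt in zip(dfparam.columns, dfparam.dtypes):
        if "datetime" in str(dt):
            dtypedict[col] = TIMESTAMP   # timestamp without time zone
        elif "object" in str(dt):
            dtypedict[col] = BYTEA       # e.g. pickled numpy arrays
    return dtypedict

df = pd.DataFrame({
    "time": pd.to_datetime(["2021-10-29"]),
    "u": [np.zeros(3)],                  # dtype object -> BYTEA
})
print(sqlcol(df))
```

The resulting dict can then be passed directly as `dtype=sqlcol(df)` to `to_sql`.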
rftr
  • Thanks for the tips, I took in your advice and made some progress, but I encounter an error when I try to port arrays in, I've added the error and current code to my original question – xyiong Oct 29 '21 at 12:51
  • 1
    It seems like your arrays have a different order than `C`. So before adding your arrays to your resulting dataframe, try converting them with `arr = arr.copy(order='C')` with `arr` being your respective array. – rftr Oct 29 '21 at 13:21
  • Hi, thanks for the advice! I just tried that, but it still gives the same error. does it have anything to do with me doing a pickle.dump of the array? I need to do it as BYTEA is a datatype for binary data, not exactly arrays and there are no other ways to put numpy arrays into the postgresql database that I am aware of. – xyiong Oct 29 '21 at 13:49
  • Can you show the result of `print(value.flags)` after the line `value = value.copy(order='C')`, please? – rftr Oct 29 '21 at 14:11
  • I have added the output for before/after the `value = value.copy(order='C')` line – xyiong Oct 29 '21 at 14:27
  • I found out the problem! I had to remove the `value = pickle.dumps(value)` line to get it to work. Thanks for your assistance! – xyiong Oct 29 '21 at 15:10
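As a side note on the `pickle.dumps` step discussed above: pickling a numpy array yields plain bytes that round-trip losslessly regardless of the array's memory order, which is why a pickled payload is in principle storable in a bytea column. A minimal sketch, no database involved:

```python
import pickle

import numpy as np

# Fortran-ordered array, as produced e.g. by transposing or slicing
arr = np.asfortranarray(np.arange(6.0).reshape(2, 3))

payload = pickle.dumps(arr)      # plain bytes -> fits a BYTEA column
restored = pickle.loads(payload)

assert isinstance(payload, bytes)
assert np.array_equal(restored, arr)
```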