
I'm facing a problem with pandas' to_sql().

My DB: IBM Db2 (BigSQL)

Table: Hadoop table stored as Parquet

I'm connecting to BigSQL and loading some data from another DB through pandas like this:

```python
read = pd.read_sql(text(item['load_sql']), session.bind)
read.to_sql(table_, dst_session.bind, schema='dev', if_exists='append',
            chunksize=1424, index=False, method='multi')
```
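For context, a minimal self-contained sketch of that pipeline (the connection URIs and table names below are placeholders, and the `db2+ibm_db://` URI assumes the `ibm_db_sa` SQLAlchemy dialect):

```python
# Minimal sketch of the load pipeline; all connection strings and names are examples.
import pandas as pd
from sqlalchemy import create_engine, text

src_engine = create_engine("postgresql://user:pw@src-host:5432/srcdb")      # example source DB
dst_engine = create_engine("db2+ibm_db://user:pw@bigsql-host:32051/bludb")  # BigSQL via ibm_db_sa

load_sql = "SELECT * FROM some_schema.some_table"  # stands in for item['load_sql']

read = pd.read_sql(text(load_sql), src_engine)
read.to_sql("target_table", dst_engine, schema="dev",
            if_exists="append", chunksize=1424, index=False, method="multi")
```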

Everything works when chunksize is less than 1425, but at 1425 or higher I get this error:

```
(ibm_db_dbi.ProgrammingError) ibm_db_dbi::ProgrammingError: Exception('Statement Execute Failed: [IBM][CLI Driver] Wrong number of parameters. SQLSTATE=07001 SQLCODE=-99999')
```

I don't think it's related to the SQL command string size, because with chunksize=1424 the statement isn't even 1 MB.

Is there some kind of restriction on chunksize with Db2 / Hadoop? Does anyone know why I'm getting this error?
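One possible explanation (an assumption on my part; I haven't found it in IBM's docs): with method='multi', pandas emits a single INSERT with one parameter marker per value, so the total marker count is chunksize × number of columns. If the CLI driver capped that total at, say, 32767 (a common 16-bit limit), then a 23-column frame would give 32752 markers at 1424 rows (under the cap) and 32775 at 1425 rows (over it), which would match the cutoff here exactly. Under that assumption, a safe chunksize could be derived from the column count:

```python
# Hypothetical guard: keep chunksize * n_columns under an assumed driver cap
# on parameter markers. 32767 is an assumption, not a documented Db2 limit.
MAX_PARAMS = 32767

def safe_chunksize(df, max_params=MAX_PARAMS):
    """Largest chunksize such that chunksize * len(df.columns) <= max_params."""
    return max(1, max_params // len(df.columns))

read.to_sql(table_, dst_session.bind, schema='dev', if_exists='append',
            chunksize=safe_chunksize(read), index=False, method='multi')
```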

Resolved?: I don't use to_sql() anymore. Instead I build the whole INSERT command as a single string and send it to the DB. The only limitation with that approach seems to be the statement string size, but the insertion speed increased and I can fit more rows into one statement than chunksize allowed before.
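For reference, a rough sketch of that string-building approach (the value escaping below is naive quote-doubling, so it is only safe for trusted data; `exec_driver_sql` requires SQLAlchemy 1.4+ and sends the raw string without bind-parameter parsing):

```python
# Rough sketch: render one big INSERT ... VALUES (...), (...), ... statement.
# render_value() is a simplistic literal formatter for trusted data only;
# NaN/NaT and date/time formatting are not handled here.
def render_value(v):
    if v is None:
        return "NULL"
    if isinstance(v, str):
        return "'" + v.replace("'", "''") + "'"
    return str(v)

def build_insert(df, schema, table):
    cols = ", ".join(df.columns)
    rows = ",\n".join(
        "(" + ", ".join(render_value(v) for v in row) + ")"
        for row in df.itertuples(index=False, name=None)
    )
    return f"INSERT INTO {schema}.{table} ({cols}) VALUES\n{rows}"

# Send the raw string to the driver (no parameter markers involved).
with dst_session.bind.begin() as conn:
    conn.exec_driver_sql(build_insert(read, "dev", table_))
```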

  • Db2 may have an internal limitation similar to the Microsoft SQL Server limitation described [here](https://stackoverflow.com/a/50706219/2144390). – Gord Thompson Feb 20 '21 at 17:10
  • Thanks for your answer. I didn't find any information on a parameter-count limitation, but let's assume I reached the limit. Do I need to build my own plain INSERT string without any parameters to speed it up? The only limit there is the INSERT statement string size (2 GB), so a command built like that should do more inserts per statement than a parametrized one. Anyway, if someone has a link to the docs about that, please share. – Chan Feb 20 '21 at 20:50

0 Answers