I am writing Python scripts to synchronize tables from an MSSQL database to a PostgreSQL DB. The original author tends to use super wide tables with a lot of regions of consecutive `NULL` holes in them.
For insertion speed, I serialize the records in bulk into a single string of the following form before calling `execute()`:

INSERT INTO A ( {col_list} )
SELECT * FROM ( VALUES (row_1), (row_2), ... ) B ( {col_list} )
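A minimal sketch of the bulk serialization described above (the `quote` helper and table/column names are hypothetical; production code should lean on the driver's own escaping, e.g. psycopg2's `mogrify`, rather than hand-rolled quoting):

```python
def quote(value):
    """Naive literal quoting, for illustration only."""
    if value is None:
        return "NULL"
    if isinstance(value, str):
        return "'" + value.replace("'", "''") + "'"
    return str(value)

def build_insert(table, columns, rows):
    """Serialize all rows into one INSERT ... SELECT * FROM (VALUES ...) statement."""
    col_list = ", ".join(columns)
    values = ", ".join(
        "(" + ", ".join(quote(v) for v in row) + ")" for row in rows
    )
    return (
        f"INSERT INTO {table} ({col_list}) "
        f"SELECT * FROM (VALUES {values}) B ({col_list})"
    )

print(build_insert("A", ["id", "ts"], [(1, "2024-01-01"), (2, None)]))
# → INSERT INTO A (id, ts) SELECT * FROM (VALUES (1, '2024-01-01'), (2, NULL)) B (id, ts)
```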
During row serialization, it is not possible to determine the data type of a `NULL` (`None` in Python). This complicates the job: every `NULL` value destined for a `timestamp` column, `integer` column, etc. needs an explicit cast to the proper type, or Pg complains about it.
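A hedged sketch of what that cast looks like in the serialized row: Postgres may resolve an all-`NULL` column in the `VALUES` list to `text`, which then fails to match a `timestamp` or `integer` target column, so each `None` literal carries a cast. The per-column type list passed in here is a hypothetical input, not part of the original code:

```python
def serialize_row(row, pg_types):
    """Render one VALUES row, casting each None to its column's Postgres type."""
    parts = []
    for value, pg_type in zip(row, pg_types):
        if value is None:
            parts.append(f"NULL::{pg_type}")  # explicit cast keeps Pg happy
        else:
            parts.append(repr(value))         # placeholder quoting for the sketch
    return "(" + ", ".join(parts) + ")"

print(serialize_row((1, None, None), ["integer", "timestamp", "text"]))
# → (1, NULL::timestamp, NULL::text)
```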
Currently I check the DB-API `cursor.description` attribute, compare the `type_code` for every column, and add a type cast such as `::timestamp` as needed.
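The `type_code` lookup might be sketched like this. Note the mapping is an assumption for illustration: pyodbc reports Python types in `description[1]`, while other drivers use driver-specific numeric codes, so the keys would differ per driver:

```python
import datetime

# Hypothetical map from DB-API type_code to a Postgres cast suffix.
PG_CAST = {
    datetime.datetime: "::timestamp",
    datetime.date: "::date",
    int: "::integer",
    float: "::double precision",
}

def casts_for(description):
    """description: sequence of 7-item DB-API column descriptors.
    Returns one cast suffix per column ('' when no cast is needed)."""
    return [PG_CAST.get(col[1], "") for col in description]

# Simulated cursor.description, as a driver would expose it:
desc = [("id", int, None, None, None, None, None),
        ("ts", datetime.datetime, None, None, None, None, None),
        ("name", str, None, None, None, None, None)]
print(casts_for(desc))
# → ['::integer', '::timestamp', '']
```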
But this feels cumbersome and duplicates work: the driver has already converted the data from text into the proper Python types, and now I have to reconstruct that type information myself for columns holding so many `None`s.
Is there a better way to work around this with elegance and simplicity?