I have been working with bulk inserts into our Azure Synapse dedicated SQL pool (formerly SQL DW) using this code:
import logging
import pyodbc

# sy_server and sy_cnxn (the Synapse server name and the pyodbc connection string)
# are defined elsewhere in the module.

def save_sql_bulk(self, table_name, schema, file_path):
    logging.info(f"Starting process to save data to table: {table_name}")
    logging.info(f"Connecting to sink database: {sy_server}")
    cnxn_sy = pyodbc.connect(sy_cnxn)
    logging.info(f"Appending to table: {table_name} with schema: {schema} from path: {file_path}")
    query = f'''
    COPY INTO {schema}.{table_name}
    FROM 'https://siidatalake.blob.core.windows.net/containername/{file_path}'
    WITH (
        FIRSTROW = 2,
        FILE_TYPE = 'CSV',
        MAXERRORS = 100,
        FIELDTERMINATOR = ';',
        CREDENTIAL = (IDENTITY = 'MANAGED IDENTITY')
    )'''
    logging.info(f"Executing query: \n {query} \n")
    print(f"Executing query: \n {query} \n")
    cnxn_sy.execute(query)
    cnxn_sy.commit()
    cnxn_sy.close()
    logging.info("Done appending without problems.")
All in all, I have no problems whatsoever in most cases, but from time to time I run into the following error while executing this statement:
Exception while proccesing [file_path] with error [('42000', "[42000] [Microsoft][ODBC Driver 17 for SQL Server][SQL Server]111213;The current transaction has aborted, and any pending changes have been rolled back. Cause: 111208;An internal error occurred that prevents further processing of this command: 'Object reference not set to an instance of an object.'. (111213) (SQLExecDirectW)")]
Sometimes it even happens in a process that had already been working for a while. What could this mean? My table definitions have not changed, and the process fails even with data that was already processed successfully before. For more context, I'm inserting a couple of million rows per run. I read in another thread that it could be database related, but in my case the database is managed by Azure, so I don't think this is something that will be fixed with a database patch or reinstall. Any ideas?
Thanks.