I have a problem, where I see deadlocks on my Azure SQL Database. These occur since we changed the code from a virtual machine into Airflow. Which means that the most obvious changes are:
- Windows Server 2022 Datacenter Azure Edition to Kubernetes 1.25.6
- Version changes of SQLAlchemy and/or pandas.
In df.to_sql
, it is always an if_exists='replace'
, then it always crashes on check_case_sensitive
where it deadlocks on sys.tables
. If it was on an actual data table I would understand it is something like a transaction commitment issue, if I checked the sys.tables myself I would maybe understand it, but the error is raised deep within the module. The case check is just a check to raise a warning "The provided table name '{name}' is not found exactly as such in the database after writing the table, possibly due to case sensitivity issues. Consider using lower case table names." without even fixing it.
select [INFORMATION_SCHEMA].[TABLES].[TABLE_NAME]
from [INFORMATION_SCHEMA].[TABLES]
where [INFORMATION_SCHEMA].[TABLES].[TABLE_SCHEMA] = CAST('dbo' as nvarchar(max))
and [INFORMATION_SCHEMA].[TABLES].[TABLE_TYPE] = CAST('BASE TABLE' as nvarchar(max))
order by [INFORMATION_SCHEMA].[TABLES].[TABLE_NAME]
We have no full transaction log on the SQL DB. What possibilities do I have to find the problem? I'm lost for ideas.
Traceback:
Traceback (most recent call last):
File "/opt/airflow/dags/repo/dags/xyz.py", line 308, in xyzabc
df.to_sql(tablename, con=engine, if_exists=if_exists, dtype=datatypedict, index=False, chunksize=10000)
File "/home/airflow/.local/lib/python3.10/site-packages/pandas/core/generic.py", line 2878, in to_sql
return sql.to_sql(
File "/home/airflow/.local/lib/python3.10/site-packages/pandas/io/sql.py", line 769, in to_sql
return pandas_sql.to_sql(
File "/home/airflow/.local/lib/python3.10/site-packages/pandas/io/sql.py", line 1932, in to_sql
self.check_case_sensitive(name=name, schema=schema)
File "/home/airflow/.local/lib/python3.10/site-packages/pandas/io/sql.py", line 1832, in check_case_sensitive
table_names = insp.get_table_names(schema=schema or self.meta.schema)
File "/home/airflow/.local/lib/python3.10/site-packages/sqlalchemy/engine/reflection.py", line 266, in get_table_names
return self.dialect.get_table_names(
File "<string>", line 2, in get_table_names
File "/home/airflow/.local/lib/python3.10/site-packages/sqlalchemy/engine/reflection.py", line 55, in cache
ret = fn(self, con, *args, **kw)
File "/home/airflow/.local/lib/python3.10/site-packages/sqlalchemy/dialects/mssql/base.py", line 2613, in wrap
return _switch_db(
File "/home/airflow/.local/lib/python3.10/site-packages/sqlalchemy/dialects/mssql/base.py", line 2655, in _switch_db
return fn(*arg, **kw)
File "/home/airflow/.local/lib/python3.10/site-packages/sqlalchemy/dialects/mssql/base.py", line 3065, in get_table_names
table_names = [r[0] for r in connection.execute(s)]
File "/home/airflow/.local/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 1385, in execute
return meth(self, multiparams, params, _EMPTY_EXECUTION_OPTS)
File "/home/airflow/.local/lib/python3.10/site-packages/sqlalchemy/sql/elements.py", line 334, in _execute_on_connection
return connection._execute_clauseelement(
File "/home/airflow/.local/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 1577, in _execute_clauseelement
ret = self._execute_context(
File "/home/airflow/.local/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 1948, in _execute_context
self._handle_dbapi_exception(
File "/home/airflow/.local/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 2129, in _handle_dbapi_exception
util.raise_(
File "/home/airflow/.local/lib/python3.10/site-packages/sqlalchemy/util/compat.py", line 211, in raise_
raise exception
File "/home/airflow/.local/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 1905, in _execute_context
self.dialect.do_execute(
File "/home/airflow/.local/lib/python3.10/site-packages/sqlalchemy/engine/default.py", line 736, in do_execute
cursor.execute(statement, parameters)
sqlalchemy.exc.DBAPIError: (pyodbc.Error) ('40001', '[40001] [Microsoft][ODBC Driver 18 for SQL Server][SQL Server]Transaction (Process ID 418) was deadlocked on lock resources with another process and has been chosen as the deadlock victim. Rerun the transaction. (1205) (SQLExecDirectW)')
[SQL: SELECT [INFORMATION_SCHEMA].[TABLES].[TABLE_NAME]
FROM [INFORMATION_SCHEMA].[TABLES]
WHERE [INFORMATION_SCHEMA].[TABLES].[TABLE_SCHEMA] = CAST(? AS NVARCHAR(max)) AND [INFORMATION_SCHEMA].[TABLES].[TABLE_TYPE] = CAST(? AS NVARCHAR(max)) ORDER BY [INFORMATION_SCHEMA].[TABLES].[TABLE_NAME]]
[parameters: ('dbo', 'BASE TABLE')]
(Background on this error at: https://sqlalche.me/e/14/dbapi)