1

In my asp.net core application I am using Hangfire with MySQL database storage. I have an endpoint which when accessed schedules a hangfire job on the background. When I do load testing to this endpoint, if I send more than 40 concurrent requests, this code BackgroundJob.Schedule<IJobSchedulerCallbacks>(s => s.ScheduleSomeCode(); starts throwing the following exception:

Hangfire.BackgroundJobClientException: Background job creation failed. See inner exception for details. ---> MySql.Data.MySqlClient.MySqlException: Deadlock found when trying to get lock; try restarting transaction
at MySql.Data.MySqlClient.MySqlStream.ReadPacket()
at MySql.Data.MySqlClient.NativeDriver.GetResult(Int32& affectedRow, Int64& insertedId)
at MySql.Data.MySqlClient.Driver.NextResult(Int32 statementId, Boolean force)
at MySql.Data.MySqlClient.MySqlDataReader.NextResult()
at MySql.Data.MySqlClient.MySqlCommand.ExecuteReader(CommandBehavior behavior)
at MySql.Data.MySqlClient.MySqlCommand.ExecuteNonQuery()
at Dapper.SqlMapper.ExecuteCommand(IDbConnection cnn, CommandDefinition& command, Action`2 paramReader)
at Dapper.SqlMapper.ExecuteImpl(IDbConnection cnn, CommandDefinition& command)
at Dapper.SqlMapper.Execute(IDbConnection cnn, String sql, Object param, IDbTransaction transaction, Nullable`1 commandTimeout, Nullable`1 commandType)
at Hangfire.MySql.MySqlWriteOnlyTransaction.<>c__DisplayClass14_0.<AddToSet>b__0(MySqlConnection x)
at Hangfire.MySql.MySqlWriteOnlyTransaction.<Commit>b__29_0(MySqlConnection connection)
at Hangfire.MySql.MySqlStorage.<>c__DisplayClass18_0.<UseTransaction>b__0(MySqlConnection connection)
at Hangfire.MySql.MySqlStorage.UseConnection[T](Func`2 func)
at Hangfire.MySql.MySqlStorage.UseTransaction[T](Func`2 func, Nullable`1 isolationLevel)
at Hangfire.MySql.MySqlStorage.UseTransaction(Action`1 action)
at Hangfire.MySql.MySqlWriteOnlyTransaction.Commit()
at Hangfire.Client.CoreBackgroundJobFactory.Create(CreateContext context)
at Hangfire.Client.BackgroundJobFactory.<>c__DisplayClass7_0.<CreateWithFilters>b__0()
at Hangfire.Client.BackgroundJobFactory.InvokeClientFilter(IClientFilter filter, CreatingContext preContext, Func`1 continuation)
at Hangfire.Client.BackgroundJobFactory.Create(CreateContext context)
at Hangfire.BackgroundJobClient.Create(Job job, IState state)
--- End of inner exception stack trace ---
at Hangfire.BackgroundJobClient.Create(Job job, IState state)
at Hangfire.BackgroundJobClientExtensions.Schedule[T](IBackgroundJobClient client, Expression`1 methodCall, TimeSpan delay)
at Hangfire.BackgroundJob.Schedule[T](Expression`1 methodCall, TimeSpan delay)

When I inspect the innodb log by using this command: SHOW ENGINE INNODB STATUS I get the following logs:

=====================================
2018-12-19 14:37:29 0x2ab9c5591700 INNODB MONITOR OUTPUT
=====================================
Per second averages calculated from the last 53 seconds
-----------------
BACKGROUND THREAD
-----------------
srv_master_thread loops: 2441 srv_active, 0 srv_shutdown, 13392 srv_idle
srv_master_thread log flush and writes: 15830
----------
SEMAPHORES
----------
OS WAIT ARRAY INFO: reservation count 7531
OS WAIT ARRAY INFO: signal count 8029
RW-shared spins 0, rounds 15152, OS waits 6763
RW-excl spins 0, rounds 15133, OS waits 270
RW-sx spins 58, rounds 1734, OS waits 37
Spin rounds per wait: 15152.00 RW-shared, 15133.00 RW-excl, 29.90 RW-sx
------------------------
LATEST DETECTED DEADLOCK
------------------------
2018-12-19 13:41:01 0x2aba11f50700
*** (1) TRANSACTION:
TRANSACTION 88410, ACTIVE 0 sec inserting
mysql tables in use 1, locked 1
LOCK WAIT 3 lock struct(s), heap size 1136, 2 row lock(s), undo log entries 1
MySQL thread id 443, OS thread handle 46979012679424, query id 374494 172.31.25.222 cpdbuser update
INSERT INTO `Set` (`Key`, `Value`, `Score`) VALUES (''schedule'', ''475'', 1545313257) ON DUPLICATE KEY UPDATE `Score` = 1545313257
*** (1) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 147 page no 4 n bits 176 index IX_Set_Key_Value of table `cp-hangfire`.`Set` trx id 88410 lock_mode X locks gap before rec insert intention waiting
Record lock, heap no 103 PHYSICAL RECORD: n_fields 3; compact format; info bits 0
 0: len 8; hex 7363686564756c65; asc schedule;;
 1: len 3; hex 343736; asc 476;;
 2: len 4; hex 80000088; asc     ;;

*** (2) TRANSACTION:
TRANSACTION 88408, ACTIVE 0 sec inserting
mysql tables in use 1, locked 1
4 lock struct(s), heap size 1136, 3 row lock(s), undo log entries 1
MySQL thread id 457, OS thread handle 46978653554432, query id 374490 172.31.25.222 cpdbuser update
INSERT INTO `Set` (`Key`, `Value`, `Score`) VALUES (''schedule'', ''474'', 1545313257) ON DUPLICATE KEY UPDATE `Score` = 1545313257
*** (2) HOLDS THE LOCK(S):
RECORD LOCKS space id 147 page no 4 n bits 176 index IX_Set_Key_Value of table `cp-hangfire`.`Set` trx id 88408 lock_mode X locks gap before rec
Record lock, heap no 103 PHYSICAL RECORD: n_fields 3; compact format; info bits 0
 0: len 8; hex 7363686564756c65; asc schedule;;
 1: len 3; hex 343736; asc 476;;
 2: len 4; hex 80000088; asc     ;;

*** (2) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 147 page no 4 n bits 176 index IX_Set_Key_Value of table `cp-hangfire`.`Set` trx id 88408 lock_mode X locks gap before rec insert intention waiting
Record lock, heap no 103 PHYSICAL RECORD: n_fields 3; compact format; info bits 0
 0: len 8; hex 7363686564756c65; asc schedule;;
 1: len 3; hex 343736; asc 476;;
 2: len 4; hex 80000088; asc     ;;

*** WE ROLL BACK TRANSACTION (1)

Note that these two very simple transactions with only one insert command have created the deadlock:

INSERT INTO `Set` (`Key`, `Value`, `Score`) VALUES (''schedule'', ''475'', 1545313257) ON DUPLICATE KEY UPDATE `Score` = 1545313257
INSERT INTO `Set` (`Key`, `Value`, `Score`) VALUES (''schedule'', ''474'', 1545313257) ON DUPLICATE KEY UPDATE `Score` = 1545313257

Here is the Set table schema: Set table schema And here is the Set table unique index on Value and Score columns: Indexes

I found this stackoverflow answer that says it is expected mysql innodb to make deadlocks even on perfectly normal situations which I find weird. Anyway as a solution I tried implementing an exponential back-off retry policy by using Polly, which is a great library. But this only postponed the errors because now the code for scheduling jobs gets retried and after the 3th retry the client connection is simply dropped due to the 30 seconds nginx response timeout.

First question: Why MySQL starts deadlocking when this simple command of scheduling a job is executed concurrently?

Second question If it is really true that innodb is expected to create deadlocks even in normal circumstances then how is MySql expected to be used in any production database which is expected to have more concurrent users ? Am I missing something ?

(from comment)

CREATE TABLE `Set` (
    `Id` int(11) NOT NULL AUTO_INCREMENT, 
    `Key` varchar(100) NOT NULL, 
    `Value` varchar(256) NOT NULL, 
    `Score` double DEFAULT NULL, 
    `ExpireAt` datetime DEFAULT NULL, 
    PRIMARY KEY (`Id`), 
    UNIQUE KEY `IX_Set_Key_Value` (`Key`,`Value`)
) ENGINE=InnoDB AUTO_INCREMENT=143 DEFAULT CHARSET=latin1
Rick James
  • 135,179
  • 13
  • 127
  • 222
Dejan Bogatinovski
  • 610
  • 1
  • 6
  • 21
  • Please provide `SHOW CREATE TABLE`. Please provide the SQL for the entire transaction (BEGIN thru COMMIT). – Rick James Dec 19 '18 at 17:00
  • 1
    "Key-value" schemas, in general, have lots of troubles. I think you have found another reason to eschew them. – Rick James Dec 19 '18 at 17:04
  • Here is the result from SHOW CREATE TABLE ``'Set', 'CREATE TABLE `Set` (\n `Id` int(11) NOT NULL AUTO_INCREMENT,\n `Key` varchar(100) NOT NULL,\n `Value` varchar(256) NOT NULL,\n `Score` double DEFAULT NULL,\n `ExpireAt` datetime DEFAULT NULL,\n PRIMARY KEY (`Id`),\n UNIQUE KEY `IX_Set_Key_Value` (`Key`,`Value`)\n) ENGINE=InnoDB AUTO_INCREMENT=143 DEFAULT CHARSET=latin1'`` – Dejan Bogatinovski Dec 19 '18 at 17:37
  • I don't know how to find the entire transaction statements – Dejan Bogatinovski Dec 19 '18 at 17:44
  • But Hangfire is famous background-job library for .net and I am supposing it shouldn't throw deadlocks on the most essential operation of the library which is to schedule a job. I should be able to schedule 100s of jobs concurrently without deadlock fails. – Dejan Bogatinovski Dec 19 '18 at 17:54
  • Unless you need `id` for something else, get rid of it, and promote `UNIQUE(key,value)` to be `PRIMARY KEY`. – Rick James Dec 19 '18 at 18:51
  • Thank you for the suggestion but I am not able to do that because this table is part of the Hangfire library which I use. I think I should post this on their github issues page. – Dejan Bogatinovski Dec 20 '18 at 12:07

1 Answers1

1

First question : I don't know Hangfire, but it is unlikely that it only run a single insert query in CoreBackgroundJobFactory.Create. It may at lease perform select on another table which can be locked itself and combination on these 2 processes can lock themselves.

Second question : Innodb Locking strategy depends on transaction isolation level If you are running high concurrency environment, you can lower the isolation level : it will decrease deadlock probability. However some ACID side-effect could appear, even if in my personal experience I didn't encouter any even with READ_UNCOMMITED. You can try to add this to Hangfire datasource configuration and see what happens

Benjamin Caure
  • 2,090
  • 20
  • 27
  • I will try with `READ_COMMITTED` and see what happens. Although I think I should post this to hangfire's github issues page and see what they say. – Dejan Bogatinovski Dec 20 '18 at 12:08