Situation:
we have a master-master-replication using GTIDs on Percona MySQL 5.6.32-78.1. On server, there are about 10 databases and we've set slave_parallel_workers=5
. One server is used for frontend handling and one for backend. Two or three times a week, the replication on the backend server dies with error
2016-10-25 10:00:01 165238 [Warning] Slave SQL: Worker 4 failed executing transaction '0e7b97a8-a689-11e5-8b79-901b0e8b0f53:22506262' at master log mysql-bin.011888, end_log_pos 9306420; Could not execute Update_rows event on table shop.sessions; Deadlock found when trying to get lock; try restarting transaction, Error_code: 1213; handler error HA_ERR_LOCK_DEADLOCK; the event's master log mysql-bin.011888, end_log_pos 9306420, Error_code: 1213
2016-10-25 10:00:01 165238 [ERROR] Slave SQL: ... The slave coordinator and worker threads are stopped, possibly leaving data in inconsistent state. A restart should restore consistency automatically, although using non-transactional storage for data or info tables or DDL queries could lead to problems. In such cases you have to examine your data (see documentation for details). Error_code: 1756
2016-10-25 10:00:01 165238 [Note] Error reading relay log event: slave SQL thread was killed
What could be the reason? There are no cross-database DML statements and I thought by using MTS, only one thread is used per database (the benefit of MTS is using parallel replication across several databases)? Why does a repliation break with a deadlock?
EDIT 2016-10-28:
Schema of table looks like
CREATE TABLE `sessions` (
`id` int(11) NOT NULL,
`session_id` char(40) CHARACTER SET utf8 COLLATE utf8_bin NOT NULL,
`crypt_iv` blob NOT NULL,
`data` mediumblob NOT NULL,
`user_id` int(11) NOT NULL,
`last_refresh` datetime NOT NULL,
`timeout` datetime NOT NULL,
`closed` tinyint(4) NOT NULL,
`inserted` datetime NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
ALTER TABLE `sessions`
ADD PRIMARY KEY (`id`),
ADD UNIQUE KEY `session_id` (`session_id`),
ADD KEY `user_id` (`user_id`),
ADD KEY `timeout` (`timeout`);
ALTER TABLE `sessions` MODIFY `id` int(11) NOT NULL AUTO_INCREMENT;
At time this error has only happened on backend side, never on frontend server. At the moment I cannot paste the exact statement as binary logs are purged. But the only statement inside this GTID transaction is a row-based UPDATE on the table.