
I have a two-threaded process accessing data in a MySQL table and, as can happen with MySQL, I sometimes get deadlocks.

What I have in place now is: in case of a deadlock, update all the rows back to their previous state and then restart the process.

But I suspect something is going wrong, because in one case I wanted to update about 1,300 rows and after the deadlock the count showed about 1,700.

So some of the rows were possibly processed twice by the second instance of the process.

Should I kill all the running threads before restarting the process?

EDIT:

I am using InnoDB with an index created on id. Each thread is supposed to fetch up to 500 rows whose processed flag is 'N', process them, and then update the individual rows to 'Y'.

I also update a process id on each row to indicate which thread should act on which rows.

I'm still getting deadlocks on ResultSet.updateRow().

Table:

create table Mail(
id integer not null auto_increment primary key,
from_eaddress varchar(50) not null,
to_address varchar(50) not null,
process_flg varchar(1) not null,
process_id varchar(50) default null,
time_stamp datetime default null,
index(id)
) engine=InnoDB;
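
To make that flow concrete, here is a minimal sketch of the claim-then-read step described in the edit above, assuming each worker first stamps unclaimed rows with its own process_id and then reads back only its rows (the method and variable names are made up; column names come from the table above):

// Requires java.sql.Connection, PreparedStatement, ResultSet, SQLException.
// Claim up to 500 unprocessed rows for this worker, then read them back.
static void claimAndProcess(Connection conn, String workerId) throws SQLException {
    try (PreparedStatement claim = conn.prepareStatement(
            "UPDATE Mail SET process_id = ?, time_stamp = NOW() "
          + "WHERE process_flg = 'N' AND process_id IS NULL LIMIT 500")) {
        claim.setString(1, workerId);
        claim.executeUpdate();                 // these rows now belong to this worker
    }
    try (PreparedStatement fetch = conn.prepareStatement(
            "SELECT id, from_eaddress, to_address FROM Mail "
          + "WHERE process_flg = 'N' AND process_id = ?")) {
        fetch.setString(1, workerId);
        try (ResultSet rs = fetch.executeQuery()) {
            while (rs.next()) {
                // send the mail, then set process_flg = 'Y' for rs.getInt("id")
            }
        }
    }
}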

Finally, here is my working code: https://github.com/codezero10/MailQueue/tree/master

codezero

3 Answers


It seems to be a transactional problem. You must put the necessary locking logic in your program/controller BEFORE accessing the information. If you try to access a row that is in use, you should always (not just sometimes) hit this situation; it only appears sometimes because the two threads do not always access the same rows at the same moment, but relying on that is incorrect behavior.

Note: a modern RDBMS manages its memory and locking on its own, so you need to implement a prevention mechanism in your application, such as mutual exclusion.
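
For two worker threads inside the same JVM, the simplest form of that mutual exclusion is a shared lock around the fetch-and-update step. This is only a sketch (the class and lock names are made up), and it does not help if the workers run in separate processes:

import java.util.concurrent.locks.ReentrantLock;

// Only one worker at a time may fetch and update a batch of Mail rows.
// This keeps the two threads from contending for the same row locks,
// at the cost of serializing the workers.
public class MailQueueWorker implements Runnable {
    private static final ReentrantLock QUEUE_LOCK = new ReentrantLock();

    @Override
    public void run() {
        QUEUE_LOCK.lock();            // enter the critical section
        try {
            // fetch a batch, send the mails, set process_flg = 'Y'
        } finally {
            QUEUE_LOCK.unlock();      // always release, even on exceptions
        }
    }
}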

crm86

I was experiencing a MySQL deadlocking issue, outlined here: What are 'best practices' for dealing with transient database errors in a highly concurrent Rails system?

What I ultimately found is explained in my own answer here.

In short, the tables were not indexed optimally, which meant queries ran much slower than they could have (and long-running queries make deadlock more likely). Adding the proper index completely removed the deadlocking issue.
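
For the claim query discussed in the comments below (filtering on time_stamp, process_flg and process_id), a "proper index" would be a composite index over those columns. A possible version, created via JDBC since the project is Java (the index name and column order are assumptions; check the order against your own data's selectivity):

// One-off DDL: composite index covering the columns used in the worker's WHERE clause.
// Requires java.sql.Statement.
try (Statement ddl = conn.createStatement()) {
    ddl.execute("ALTER TABLE Mail "
              + "ADD INDEX idx_mail_claim (time_stamp, process_flg, process_id)");
}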

Mark Bolusmjak
  • I have id as my primary key and an index is set on that id. I have added the table structure to the question. – codezero Apr 21 '15 at 16:41
  • What do your queries that update look like? Do they only query on ID? – Mark Bolusmjak Apr 21 '15 at 16:53
  • This is what I'm using; I'm supplying the timestamp and process id to match: SELECT * FROM Mail where time_stamp=? and process_flg='N' and process_id=? limit 500 for update – codezero Apr 21 '15 at 17:00
  • So the only index you have is on ID and you are querying by other columns. So the index is not used and the entire table will be scanned. This is as inefficient as possible. You should either have an index on one of the columns in the query, or a composite index of several columns (it depends on the nature of your data, which only you know). – Mark Bolusmjak Apr 21 '15 at 17:16
  • OK, so here I'm querying on time_stamp, process_flg and process_id, so the index should be on any one or a combination of them? But all 3 columns have repeated data. For example, time_stamp would be the same for 500 rows. – codezero Apr 21 '15 at 17:29
  • An index need not be unique. Enforcing uniqueness of a column is optional. You should probably do some reading on indexes. In short, MySQL can only use 1 index (composite or not) per table per query. Using an index to find data is fast. If MySQL can't find an appropriate index, it needs to scan the table and that's slow. Use MySQL's 'explain' feature to get some insight as to the effectiveness of your queries. There are many tradeoffs. Only someone with access to ALL of your data and queries can produce an optimal set of indexes. Stack Overflow can't do it for you. – Mark Bolusmjak Apr 21 '15 at 17:40
  • If you have a composite index on columns X,Y,Z, then a query using X, or X and Y, or X and Y and Z can use that index. Putting columns with high selectivity first will help. The reason we don't index everything is because there is a cost associated with updating indexes during insert/update operations. That's why it's all about trade-offs. – Mark Bolusmjak Apr 21 '15 at 17:46
  • Thanks, I actually modified it to have an update that uses the index only, but I'm still having deadlock issues (less frequent though). This is my project: https://github.com/codezero10/MailQueue/commit/107ae447351a1676ced1c28bf3c488b84d31c5b2 Maybe I'm asking a lot, but if you have time can you help me out a bit? – codezero Apr 22 '15 at 14:12
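
Following the EXPLAIN suggestion in the comments above, one quick way to see whether the claim query actually uses an index is to run EXPLAIN from the same JDBC connection. A sketch only: the literal values are placeholders, and the query is the one from the comments without FOR UPDATE:

// If the "key" column of the plan is NULL, no index is used and the whole table
// is scanned, which makes long lock waits and deadlocks more likely.
// Requires java.sql.Statement and ResultSet.
String explainSql = "EXPLAIN SELECT * FROM Mail "
        + "WHERE time_stamp = '2015-04-21 17:00:00' "    // placeholder literal
        + "AND process_flg = 'N' AND process_id = 'worker-1' LIMIT 500";
try (Statement stmt = conn.createStatement();
     ResultSet rs = stmt.executeQuery(explainSql)) {
    while (rs.next()) {
        System.out.println("key=" + rs.getString("key")
                + " rows=" + rs.getString("rows"));
    }
}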

The deadlock issue I was facing was transaction related.

What I did was wrap my update statement in a transaction block, and the deadlock vanished.

        conn.setAutoCommit(false);                       // start transaction
        conn.setTransactionIsolation(Connection.TRANSACTION_SERIALIZABLE);
        String sql2 = "update EmailQueue set process_flg='Y' where id=" + id;
        // this is where the deadlocks were happening
        Statement stmt = conn.createStatement();
        stmt.executeUpdate(sql2);
        stmt.close();
        conn.commit();                                   // complete transaction
        conn.setAutoCommit(true);
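
A common refinement of this is to catch MySQL's deadlock error (error code 1213, ER_LOCK_DEADLOCK) and retry just the failed transaction instead of rolling everything back and restarting the whole process. A minimal sketch under those assumptions (the method name and retry limit are made up):

// Requires java.sql.Connection, PreparedStatement, SQLException.
static void markProcessed(Connection conn, int id) throws SQLException {
    int attempts = 0;
    while (true) {
        try {
            conn.setAutoCommit(false);
            try (PreparedStatement ps = conn.prepareStatement(
                    "update EmailQueue set process_flg='Y' where id=?")) {
                ps.setInt(1, id);
                ps.executeUpdate();
            }
            conn.commit();
            return;                                  // success
        } catch (SQLException e) {
            conn.rollback();                         // release any locks we hold
            if (e.getErrorCode() == 1213 && ++attempts < 3) {
                continue;                            // deadlock: retry the transaction
            }
            throw e;                                 // other error, or too many retries
        } finally {
            conn.setAutoCommit(true);
        }
    }
}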
codezero