
I have a multi-threaded client/server system with thousands of clients continuously sending data to the server, where it is stored in a specific table. This data is only important for a few days and is deleted afterwards.

The server is written in J2SE, the database is MySQL, and my table uses the InnoDB engine. It contains some millions of entries (and is indexed properly for this usage).

One scheduled thread runs once a day to delete old entries. This thread can take a long time, because the number of rows to delete can be very large (some millions of rows). On my specific system, deleting 2.5 million rows takes about 3 minutes.

The inserting threads (and reading threads) get a timeout error telling me

Lock wait timeout exceeded; try restarting transaction

  1. How can I detect that state from my Java code? I would prefer handling the situation myself instead of just waiting. But the more important point is: how can I prevent that situation? (A sketch of detection and the isolation-level change follows after this list.)
  2. Could I use

    conn.setTransactionIsolation( Connection.TRANSACTION_READ_UNCOMMITTED )
    

    for the reading threads, so they get their information regardless of whether it is completely up to date (which is absolutely OK for this use case)?

  3. What can I do to my inserting threads to prevent blocking? They purely insert data into the table (the primary key is the tuple (userid, servertimemillis)).
  4. Should I change my deletion thread? It purely deletes data per userid where the timestamp is beyond specialtimestamp.
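
For question 1: as far as I can tell, MySQL reports this timeout as vendor error code 1205, which JDBC exposes via SQLException.getErrorCode(). Below is a minimal sketch of handling it on my own (retry with a short backoff), plus the reader isolation change from question 2. The table and column names are placeholders, not my real schema.

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.SQLException;

    public class InsertWithRetry {

        // MySQL vendor code for "Lock wait timeout exceeded" (ER_LOCK_WAIT_TIMEOUT).
        private static final int ER_LOCK_WAIT_TIMEOUT = 1205;

        // Inserts one row, retrying a few times on a lock wait timeout
        // instead of letting the exception propagate immediately.
        static void insertWithRetry(Connection conn, long userId, long serverTimeMillis)
                throws SQLException, InterruptedException {
            String sql = "INSERT INTO client_data (userid, servertimemillis) VALUES (?, ?)";
            for (int attempt = 1; ; attempt++) {
                try (PreparedStatement ps = conn.prepareStatement(sql)) {
                    ps.setLong(1, userId);
                    ps.setLong(2, serverTimeMillis);
                    ps.executeUpdate();
                    return; // success
                } catch (SQLException e) {
                    if (e.getErrorCode() != ER_LOCK_WAIT_TIMEOUT || attempt >= 3) {
                        throw e; // not a lock timeout, or retries exhausted
                    }
                    Thread.sleep(200L * attempt); // brief backoff before retrying
                }
            }
        }

        // Question 2: readers that can live with stale data.
        static void configureReader(Connection readerConn) throws SQLException {
            readerConn.setTransactionIsolation(Connection.TRANSACTION_READ_UNCOMMITTED);
        }
    }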

Edit:

Reading the MySQL documentation, I wonder whether I can simply configure the connections for inserting and deleting rows with

conn.setTransactionIsolation( Connection.TRANSACTION_READ_COMMITTED )

and achieve what I need. The documentation says that UPDATE and DELETE statements that use a unique index with a unique search condition lock only the matching index record, not the gap before it, so rows can still be inserted into that gap. It would be great to hear your experience with this, since I can't simply try it on production, and it is a big effort to simulate it in the test environment.
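
If it works the way I read it, the change itself would be a one-liner per connection. Here is a sketch of what I have in mind (placeholder connection details, table and column names; as far as I understand, MySQL of this era also requires row-based binary logging when READ COMMITTED is used with the binary log enabled):

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.SQLException;

    public class ReadCommittedCleanup {
        public static void main(String[] args) throws SQLException {
            // Placeholder connection details.
            try (Connection conn = DriverManager.getConnection(
                    "jdbc:mysql://localhost/mydb", "user", "password")) {
                // As I read the docs: under READ COMMITTED, InnoDB takes record
                // locks only on the rows the DELETE actually matches, without
                // gap locks, so concurrent INSERTs into the gaps should not block.
                conn.setTransactionIsolation(Connection.TRANSACTION_READ_COMMITTED);
                conn.setAutoCommit(false);

                // Placeholder predicate: delete the expired rows of one user.
                try (PreparedStatement ps = conn.prepareStatement(
                        "DELETE FROM client_data WHERE userid = ? AND servertimemillis < ?")) {
                    ps.setLong(1, 42L);
                    ps.setLong(2, System.currentTimeMillis() - 3L * 24 * 60 * 60 * 1000);
                    ps.executeUpdate();
                }
                conn.commit();
            }
        }
    }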

Schlangi

2 Answers


In your deletion thread, try to first load the IDs of the records to be deleted, and then delete them one at a time, committing after each delete.

If the thread that does the huge delete runs once a day and takes 3 minutes, you can split it into smaller transactions that each delete a small number of records, and still get it done fast enough.

A better solution:

First of all. Any solution you try must be tested prior to deployment in production. Especially a solution suggested by some random person on some random web site.

Now, here's the solution I suggest (making some assumptions regarding your table structure and indices, since you didn't specify them):

  1. Alter your table. It's not recommended to have a multi-column primary key in InnoDB, especially in large tables (since the primary key is automatically included in every other index). See the answer to this question for more reasons. You should add a unique RecordID column as the primary key (I'd recommend a long identifier, i.e. BIGINT in MySQL).

  2. Select the rows for deletion: execute "SELECT RecordID FROM YourTable WHERE ServerTimeMillis < ?".

  3. Commit, to quickly release the lock on the ServerTimeMillis index (which I assume you have).

  4. For each RecordID, execute "DELETE FROM YourTable WHERE RecordID = ?"

  5. Commit after each record or after every X records (I'm not sure whether that would make much of a difference). Perhaps even a single commit at the end of all the DELETE commands will suffice, since with my suggested new logic only the deleted rows should be locked. A minimal JDBC sketch of steps 2 to 5 follows below.
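
Putting steps 2 to 5 together, an untested JDBC sketch using the assumed names YourTable, RecordID and ServerTimeMillis from the list above:

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.sql.SQLException;
    import java.util.ArrayList;
    import java.util.List;

    public class BatchedCleanup {

        // Deletes rows older than the cutoff in small transactions (steps 2-5).
        static void deleteOldRows(Connection conn, long cutoffMillis, int commitEvery)
                throws SQLException {
            conn.setAutoCommit(false);

            // Step 2: collect the ids of the rows to delete.
            List<Long> ids = new ArrayList<>();
            try (PreparedStatement select = conn.prepareStatement(
                    "SELECT RecordID FROM YourTable WHERE ServerTimeMillis < ?")) {
                select.setLong(1, cutoffMillis);
                try (ResultSet rs = select.executeQuery()) {
                    while (rs.next()) {
                        ids.add(rs.getLong(1));
                    }
                }
            }
            conn.commit(); // step 3: end the transaction begun by the SELECT

            // Steps 4 and 5: delete by primary key, committing every commitEvery rows.
            try (PreparedStatement delete = conn.prepareStatement(
                    "DELETE FROM YourTable WHERE RecordID = ?")) {
                int n = 0;
                for (Long id : ids) {
                    delete.setLong(1, id);
                    delete.executeUpdate();
                    if (++n % commitEvery == 0) {
                        conn.commit(); // keep each transaction (and its locks) short
                    }
                }
            }
            conn.commit(); // commit the final partial batch
        }
    }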

As for changing the isolation level: I don't think you have to do it. I can't say whether you can do it or not, since I don't know the logic of your server and how it would be affected by such a change.

Eran
  • I'm not convinced that this would be a good idea. Assuming that 2.5 million rows are to be deleted, that would take around a full hour even if each statement lasted only 0.0015 seconds. On the other hand, the lock problem would be solved... – Schlangi Nov 26 '13 at 13:55
  • +1 for the detailed explanation, and I would give another +1 for `First of all. Any solution you try must be tested prior to deployment in production. Especially a solution suggested by some random person on some random web site`. I did not want to let you think I will not or don't want to test... ;) – Schlangi Nov 26 '13 at 20:07
  • As I understand the MySQL InnoDB documentation, a delete statement even on one row will lock the surrounding previously deleted rows (the MySQL website calls these "gaps"). So a single commit after all deletions could result in a similar lock situation, but that is something I will have to try myself. The combined index will not be a problem in this specific case, because it really is the logical primary key and the only thing used to access the data. – Schlangi Nov 26 '13 at 20:14

You can try to replace your one huge DELETE with multiple shorter DELETE ... LIMIT n statements, with n determined through testing (not so small that it causes many queries, and not so large that it causes long locks). Since each lock would last only a few ms (or seconds, depending on your n), you could even let the delete thread run continuously, provided it can keep up; again, n can be adjusted so that it keeps up. Table partitioning can also help. A sketch of such a loop follows below.
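
A sketch of how that loop might look in JDBC (untested, with placeholder table and column names); executeUpdate returns the number of affected rows, so the loop can stop as soon as a batch comes back smaller than n:

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.SQLException;

    public class ChunkedDelete {

        // Repeatedly deletes up to n old rows per statement until none are left.
        static void deleteInChunks(Connection conn, long cutoffMillis, int n)
                throws SQLException {
            // With autocommit on, each DELETE ... LIMIT n is its own short
            // transaction, so locks are only held for one small batch at a time.
            try (PreparedStatement ps = conn.prepareStatement(
                    "DELETE FROM client_data WHERE servertimemillis < ? LIMIT ?")) {
                ps.setLong(1, cutoffMillis);
                ps.setInt(2, n);
                int deleted;
                do {
                    deleted = ps.executeUpdate();
                } while (deleted == n);
            }
        }
    }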

Tasos P.
  • Yes, that would do a much better job than what I have now, and I could be sure that I don't lock the other threads for too long. I'll take this into account as a quick workaround, but as you already noticed yourself, it could get tricky if the number of rows to delete grows too big to be deleted in the `x * n` timeframe before the next run has to start... What about my idea with `conn.setTransactionIsolation( Connection.TRANSACTION_READ_COMMITTED )`? – Schlangi Nov 26 '13 at 16:02
  • I tried that on a backup DB and it works much faster than single deletion statements. But it has the same inherent problem with locking, even if the locks are of course only held for a short time then. – Schlangi Nov 26 '13 at 20:21
  • I'll mark your solution as accepted, although I'm still thinking about changes to the isolation level... – Schlangi Nov 27 '13 at 16:58