1

I think that's a common problem but I haven't found any solution, perhaps I'm not searching for the problem correctly in google. In summary, I have a process that inserts multiple rows in a table (among a lot of other things in the same transaction), but the process is executed in multiple threads and multiple servers.

TABLE: COVERAGES
COLUMNS: COV_Id, COV_Description

The description is Unique, but not as a constraint in database (legacy), and I want to avoid to insert duplicate descriptions. I've isolated the search and insert in a independent transaction and I want to lock the table before select and liberate it after "Save" if it not exists.

I would something like that (high level):

{
    this.Lock(Coverage); // Lock table Coverages to avoid select out of this transaction
    Coverage coverage = session.QueryOver<Coverage>().Where(g => g.Description == description).Take(1).SingleOrDefault();
    if (coverage == null)
    {
        this.Save(new Coverage { Description = description });
    }
    return coverage;
};

I can't use lock instruction of C# because the process is executed in multiple servers, and I can't use Lock instruction of NHibernate, because precisely I want to block when I've no results.

I'm using NHibernate 3.3 for SqlServer and Oracle.

Marc
  • 1,359
  • 1
  • 15
  • 39

2 Answers2

3

I finally implement a semaphore on database to solve the problem. As I mention on my above "discussion" with Frédéric, I need to lock the thread at select to avoid duplicate insertion, the Serializable isolation level, locks on INSERT and throws deadlock exceptions when the insert is called in concurrent calls on SQL Server. By other way on Oracle throws error 08177. 00000 - "can't serialize access for this transaction", or keeps waiting for the end of the other transaction inserting the value duplicated later (see sample sql's below).

So the solution is something like this:

public Coverage CreateCoverageSessionIsolated(string description, out bool isNew)
{
    Coverage coverage = null;
    bool _isNew = false;
    this.ExecuteOnNewSession((session) =>
    {
        this.semphoresDao.LockSemaphore(session, "SMF_COVERAGES");
        coverage = session.QueryOver<Coverage>()
            .Where(g => g.Description == description)
            .Take(1) 
            .SingleOrDefault();
        _isNew = coverage == null;
        if (coverage == null)
        {
            coverage = new Coverage { Description = description };
            this.Save(coverage);
        }
    });
    isNew = _isNew;
    return coverage;
}

I adapt the real code a bit to better comprension.

  • ExecuteOnNewSession, starts a new isolated ReadCommitted transaction. So it doesn't interferes with opened transaction, to avoid timeouts on uncontrolled locks and deadlocks and reduces the risk time.
  • LockSempahore: Executes a select query, locking specific row.

I've tried it and works fine on SQL Server and Oracle.

EDIT: To check that the solution of Serializable transaction doesn't fit for me I use that simple SQL code on two concurrent transactions executing step by step, side by side:

BEGIN TRAN; -- ONLY FOR SQL
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
SELECT COV_ID FROM COVERAGES WHERE COV_DESCRIPTION = 'testcov';
INSERT INTO COVERAGES (COV_DESCRIPTION) VALUES ('testcov');
COMMIT;
Marc
  • 1,359
  • 1
  • 15
  • 39
  • Your answer introduction seems to implies you can have duplicates with my answer. I would love to see a [mcve] of it. I think it is wrong, you cannot have duplicates with my answer when done right (and even without any unique constraint). I have really thoroughly tested it, used in production since 2009 at least, having more than two hundred concurrent users at peak hours trying to grab the same limited supply protected by my answer logic, and never had an issue at least till I left in 2014. I have added some more explanations in my [answer](/a/43686178/1178314). – Frédéric May 03 '17 at 12:05
  • Sorry, you've reason, I had tested various solutions before to post the question, and I've written the post after by heart. I've tested now in a more controlled environment, and I've verified that the problem of your solution is, as you've said, the deadlock, not the duplicated. But as I point in a previous comment, I want to avoid manage deadlock exception. I fix the answer to fix that introduction. – Marc May 03 '17 at 15:42
  • In Oracle, the test didn't throw deadlock, depending on the order of the calls, it throws: 08177. 00000 - "can't serialize access for this transaction" or keeps waiting and when the other thread commits, inserts the value duplicated (if no unique constraint). I've tried with that simple instructions in SQL Developer and SQL Management Studio: SET TRANSACTION ISOLATION LEVEL SERIALIZABLE; SELECT COV_ID FROM COVERAGES WHERE COV_DESCRIPTION = 'testcov'; INSERT INTO COVERAGES (COV_DESCRIPTION) VALUES ('testcov'); COMMIT; – Marc May 03 '17 at 16:01
  • I cannot speak about the Oracle case, my experience with Oracle is near 0. Maybe a specific question about `Serializable` with Oracle may explain what is going on. It sounds strange that this transaction level fails to achieve what it is meant for on a major rdbms. Now if dealing with deadlock is a no go for you, sorting this case out would be just for curiosity. – Frédéric May 03 '17 at 17:01
2

You may lock your table with a transaction using IsolationLevel.Serializable.

using (var t = session.BeginTransaction(IsolationLevel.Serializable))
{
    var coverage = session.QueryOver<Coverage>()
        .Where(g => g.Description == description)
        .Take(1).SingleOrDefault();
    if (coverage == null)
    {
        coverage = new Coverage { Description = description };
        session.Save(coverage);
    }
    t.Commit();
    return coverage;
}

For limiting lock contention, this requires that your table has an index on Description and that this index is actually used by the reading query. Otherwise, it will lock the entire table instead of just locking "nearby" Description values. Read more here. For official documentation a bit more complete than .Net Framework one, read here and here.

In case two or more processes (or threads) attempt to perform concurrent conflicting inserts1, they will all encounter a deadlock. All of them excepted one will be rollbacked as deadlocks victims. The remaining single one will carry on.

The deadlock occurs on the insert, not on the select. All the processes will hang on the insert, all ending rollbacked, excepted one. This ensures no duplicates will be inserted.

This means the complete code for handling that is a bit more elaborated.

while (true)
{
    using (var session = sessFactory.OpenSession())
    {
        try
        {
            using (var t = session.BeginTransaction(IsolationLevel.Serializable))
            {
                var coverage = session.QueryOver<Coverage>()
                    .Where(g => g.Description == description)
                    .Take(1).SingleOrDefault();
                if (coverage == null)
                {
                    coverage = new Coverage { Description = description };
                    session.Save(coverage);
                }
                t.Commit();
                // Breaks the loop by the way.
                return coverage;
            }
        }
        catch (GenericADOException ex)
        {
            // SQL-Server specific code for identifying deadlocks
            var sqlEx = ex.InnerException as SqlException;
            if (sqlEx == null || sqlEx.Number != 1205)
                throw;
            // Deadlock, just try again by letting the loop go on (eventually
            // log it).
        }
    }
}

Note:
1. Conflicting according to the range locked by the DB, not only about the actual value to insert. Having an adequate index is highly recommended for reducing this range. Not having one may cause the whole table to be locked, resulting in a very poor ability to concurrently insert different values.

Frédéric
  • 9,364
  • 3
  • 62
  • 112
  • I had tried that option, but it didn't work, it's not an option, the index is not locked if I've no results, isn't it? – Marc May 02 '17 at 06:17
  • That is very strange. `Serializable` is specifically meant for "locking non existent row", by locking index ranges including them. If no index range can include them, it falls back to locking the entire table. I have already used it that way, and checked I was blocked from inserting such a row from a concurrent connection: that was working (with SQL Server 2005 then 2008 R2, on an `int` column index). Of course, sometime that was causing deadlocks, and my code was having logic for recovering from them. – Frédéric May 02 '17 at 09:12
  • If you expect to be blocked on read, that will not happen. You are blocked on the insert, and you have to handle deadlocks. (Recovering from them by throwing away the session (dispose it), taking a new one and try again.) – Frédéric May 02 '17 at 09:17
  • If you fear that one transaction will wait for another one to finish its job then insert a duplicate, that will not happen (provided your range locking select has used the same index for both transaction). Both transaction will instead encounter a deadlock, one of them will be chosen as victim and completely cancelled, while the other will carry on its job. – Frédéric May 02 '17 at 09:22
  • Serializable locks when you have results and, as you says, I want to lock in select to avoid handle with duplicates. I'm working yet with Oracle, but I've tried with SqlServer too. Now I'm trying with a "sempahore" on database (lock a specific register in another table), if it works as I expect, I'll publish the solution later. – Marc May 02 '17 at 13:16
  • I insist, `Serializable` is meant for locking even when there are no results. I do not know how you test, but all the tests I do just work (block on insert), even on a completely empty table. (Now tested with SQL Server 2016.) – Frédéric May 02 '17 at 14:36
  • And as I have already said, do not expect to be blocked on select, that does block on insert, and will cancel the concurrent transaction if any with a deadlock (so no dups guaranteed, but an exception to handle). If you want to block on select instead, `Serializable` will not do. Your idea could. – Frédéric May 02 '17 at 14:51
  • Yes, you're right, it blocks on insert, and in case of duplicate or deadlock the transaction is rollbacked, right, but it isn't what I need. I don't want to use "exceptions" to solve a "business" issue. Thanks a lot for your details and responses, but I'll use the "semaphore" idea to lock before to query if register exists. – Marc May 03 '17 at 06:37
  • @Marc "_the process is executed in multiple threads and multiple servers_" then the `Serializable` approach should be your choice. A `Semaphore` works only in one system and not across multiple servers. – Rabban May 03 '17 at 07:07
  • Yes I avoid exceptions for handling business usual cases too, but sometimes their are just the way to go. If your semaphore in DB can do it while not overly complicating code, why not. Otherwise, maybe reconsider. – Frédéric May 03 '17 at 08:31
  • I have added the deadlock handling in my answer. Testing it may be a bit cumbersome. Within a web site, put a break-point in VS after the select, launch two tabs on page trying the same insert, leaving them waiting on the break-point, pop-up the thread windows in VS, freeze the currently breaked thread, F5 for go-on debugging and wait for the other page call, then thaw the frozen thread and let all of them go. – Frédéric May 03 '17 at 09:53
  • @Rabban, is a database semaphore, so it's farm/garden safe (only one database server), watch my response below, how about? – Marc May 03 '17 at 11:08