17

We are trying to build a transactional table that only ever takes new records, inserted on a regular basis.

This simple table requires us to continuously add new records to it over time. The volume of transactions into this table is expected to be quite high, and there may also be periodic batch imports of transactions (>1000 rows) that take multiple seconds to complete.

From this data we then do a set of select statements grouping different columns to return the required values.

From our initial testing we have found a bottleneck: SQL Server blocks our SELECTs while a transaction of INSERTs is in progress.

Below is a simple example that can be run to illustrate the problem.

-- Simple DB Table

create table LOCK_TEST (
    LOCK_TEST_ID int identity,
    AMOUNT int
);

-- Run this in 1 query window

begin tran;
insert into LOCK_TEST (AMOUNT) values (1);
WAITFOR DELAY '00:00:15'; -- 15 second delay
insert into LOCK_TEST (AMOUNT) values (1);
commit;

-- In Query 2 run this in parallel

select SUM(AMOUNT)
from LOCK_TEST;

I would expect Query 2 to return straight away, showing NULL until Query 1 commits, and then show 2. We never want to see 1 returned from the second query.

The answers we have looked at relate to using WITH (NOLOCK) on the select statement. But this violates the transactional boundaries, and since the returned information may be financial in nature we don't wish to see any uncommitted details in our queries.
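For reference, this is roughly the NOLOCK variant we ruled out. Run it against the example above while Query 1 is still in its delay and it returns 1, the uncommitted row, which is exactly the dirty read we cannot accept:

-- Dirty-read variant (rejected): returns 1 while Query 1 is still uncommitted
select SUM(AMOUNT)
from LOCK_TEST with (nolock);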

My problem seems to be on the INSERT side...
Why does the INSERT block the SELECT statement even though it's not modifying any existing data?

Bonus points question: Is this a "feature" of SQL Server, or would we find this on other Database flavours also?

UPDATE I have now had time to find a local Oracle database and run the same simple test. This test passes as I would expect.

I.e. I can run the query as often as I want, and it returns NULL until the first transaction commits, then returns 2.

Is there a way to make SQL Server work like this, or do we need to move to Oracle?

stevemac
  • what indexes do you have on these tables? do you have regular index maintenance? – Mitch Wheat Apr 27 '09 at 04:41
  • There are no indexes at the moment. See the simple example listed in this post, nothing more than the simple 2 column table, basic insert and select are used to re-create this problem. – stevemac Apr 27 '09 at 06:04

4 Answers

12

This locking behavior is a feature of SQL Server. With 2005 and above, you can use row-level versioning (which is what Oracle uses by default) to achieve the same result and not block your selects. This puts extra strain on tempdb, because tempdb maintains the row versions, so make sure you allow for this. To make SQL Server behave the way you want it to, run this:

ALTER DATABASE MyDatabase
SET ALLOW_SNAPSHOT_ISOLATION ON

ALTER DATABASE MyDatabase
SET READ_COMMITTED_SNAPSHOT ON
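
With both options on, the original two-window test should behave the way you describe: the SELECT returns immediately with the last committed state (NULL, since nothing has been committed yet) and shows 2 once the insert transaction commits. One caveat: the READ_COMMITTED_SNAPSHOT change won't complete while other connections are active in the database. A sketch of one way to force it through, if you can tolerate rolling back any open transactions:

ALTER DATABASE MyDatabase
SET READ_COMMITTED_SNAPSHOT ON
WITH ROLLBACK IMMEDIATE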
Nick Kavadias
4

This is completely standard behaviour in SQL Server and Sybase (at least).

Data changes (INSERT, UPDATE, DELETE) require exclusive locks. This causes readers to be blocked:

With an exclusive (X) lock, no other transactions can modify data; read operations can take place only with the use of the NOLOCK hint or read uncommitted isolation level.

With SQL Server 2005 and above, snapshot isolation (aka row versioning) was introduced. From MS BOL: Understanding Row Versioning-Based Isolation Levels. This means a reader sees the latest committed data, but if you then want to write back you may find the data is wrong because it has since changed in the blocking transaction.

Now, this is why best practice is to keep transactions short. For example, only process what you need to between BEGIN/COMMIT, don't send emails from triggers etc.

Edit:

Locking like this happens all the time in SQL Server. However, locking for too long becomes blocking, which reduces performance. "Too long" is subjective, obviously.
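
If you do want to try the row-versioning route on the reading side only, a minimal sketch (assuming ALLOW_SNAPSHOT_ISOLATION has already been turned on for the database, as in the other answer) is to run the reader under snapshot isolation; it then sees only the last committed data and is not blocked by the open insert transaction:

-- Reader session; assumes ALLOW_SNAPSHOT_ISOLATION is ON for the database
SET TRANSACTION ISOLATION LEVEL SNAPSHOT;

select SUM(AMOUNT)
from LOCK_TEST;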

gbn
  • The transaction is short, but the volume of them cause blocking that shouldn't be required at all. See comment on question about Oracle. – stevemac Apr 27 '09 at 06:05
  • Move to Oracle if you want. If you can move to Oracle that simply, then I suspect your SQL is not optimal, you're round-tripping from client to server, or something else is causing too much blocking. – gbn Apr 27 '09 at 06:10
  • 2
    How can such a simple example be not optimal? From the simple example provided it still blocks. Can you suggest how I change the provided example to make it optimal? Please note that there will NEVER be updates to this data, only new inserts. – stevemac Apr 27 '09 at 06:13
  • 2
    Also, you can use snapshot isolation in SQL Server if you want. This mimics Oracle behaviour. – gbn Apr 27 '09 at 06:14
  • Thanks will look into it. The concept of "snapshot" in a database scares me a bit. But I'll see what it has to offer. – stevemac Apr 27 '09 at 06:15
  • Your SUM needs to scan the whole table. It can't, because of the block. In a larger example, you'd probably filter to get a SUM which will not require reading of locked rows. That's how we do it over a table that has around 5 million rows per day added. – gbn Apr 27 '09 at 06:16
1

I also ran into this issue. After investigation, it seemed that it was not the data that was locked per se, but the clustered index that was being modified as a result of the insert. Since the clustered index resides on the data pages, the associated data rows are locked as well.
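
If you want to see for yourself what is being locked, something along these lines, run from a third query window while the insert transaction is still open, lists the locks held in the current database (sys.dm_tran_locks is available from SQL Server 2005 onwards):

-- Inspect locks held while the insert transaction is open
select resource_type, request_mode, request_status, request_session_id
from sys.dm_tran_locks
where resource_database_id = DB_ID();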

user890155
1

This exists in MySQL too. Here's the logic behind it: if you perform a SELECT on some data, you expect it to operate on an up-to-date dataset. That's why an INSERT results in a table-level lock (unless the engine is able to perform row-level locking, as InnoDB can). There are ways to disable this behaviour, so that you can do "dirty reads", but they are all software- (or even database-engine-) specific.
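
As a sketch of the engine-specific point (MySQL syntax, just to illustrate; the column definitions are an assumption based on the example table above): creating the table with the InnoDB engine gives you row-level locking and consistent non-locking reads, so a plain parallel SELECT is not blocked by an open insert transaction:

-- MySQL/InnoDB version of the example table; plain SELECTs use consistent
-- non-locking reads and are not blocked by an open insert transaction
CREATE TABLE LOCK_TEST (
    LOCK_TEST_ID INT AUTO_INCREMENT PRIMARY KEY,
    AMOUNT INT
) ENGINE=InnoDB;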

shylent
  • 1
    I agree, but I also want it to acknowledge transactions. From my point of view the new records (those in the transaction) don't exist, and possibly never will. Why the database blocks on data that (from my point of view) doesn't exist seems wrong. I understand that if I was updating data in one place and SELECTing it from elsewhere I might want to wait for the update, but I want control of that... Especially with INSERTING. It seems that Oracle does as I expect. Still trying to work out if SQL Server can. – stevemac Apr 27 '09 at 06:10