4

I was going around some document in msdn and it said that "No shared locks are issued to prevent other transactions from modifying data read by the current transaction".

So in lay man term(i.e mine) this would cause the problem of dirty read. Which is too dangerous, if so then why to used it?

Does anybody knows the practical scenario where it would be used.

shadab shah
  • 551
  • 1
  • 6
  • 8
  • 1
    To summmarize: it is an extreme oversimplification to say that reading potentially dirty info is "too dangerous". In many cases, it is perfectly fine. – Andrew Barber Jun 05 '12 at 11:41
  • Andrew; I disagree. Sometimes you need to read data that is dirty, and cannot wait for locks. In this case, it is not only not dangerous, it is the correct behavior. – David Manheim Jun 05 '12 at 13:01
  • As Andrew mentions, dirty reads are usually okay and the performance increase of the query is a benefit. However if you do long transactions with multiple inserts on multiple tables which you occasionally rollback, it may not be good for you. If you keep your transactions small, you'll be fine to use NOLOCK - or setting your isolation level. – SQLMason Jun 05 '12 at 13:03
  • I liken this to HTML iframes. We're "taught" to avoid them at all cost, where in the right context, they are the right tool for the job. – SQLMason Jun 05 '12 at 13:07
  • 1
    @DavidManheim Yes, that's what I mean. The OP seemed to be saying dirty reads are too dangerous. I was suggesting that this is not the case; many times, it's perfectly fine (and as you note, even desired, insofar as you want data *fast* as opposed to necessarily 100% *accurate*). – Andrew Barber Jun 05 '12 at 15:01

8 Answers8

5

In our case (previous job) we used this for getting ballpark numbers. So for example a table that holds millions of e-mails sent in a given day, if at 5:00 we want to see "where we are" we can say:

SELECT COUNT(*) FROM dbo.MessageTable WITH (NOLOCK)
  WHERE CampaignID = x AND CustomerID = y;

A lot of people will suggest using this for COUNT(*) with no WHERE clause. I would argue that if you're willing to accept some inaccuracy in COUNT(*) you may as well do:

SELECT SUM(rows) FROM sys.partitions
  WHERE [object_id] = OBJECT_ID('dbo.tablename')
  AND index_id IN (0,1);

This number is similarly inaccurate due to in-flight transactions, but doesn't actually have to scan the table, so it is far more efficient. For our case we could use this even for subsets: with a filtered index in place (for other purposes) we could similarly query sys.partitions but use the index_id of the filtered index.

In most cases, though, using NOLOCK may feel like a turbo button, but the inaccuracy it may cause is very rarely worth it. Unless your system is already heavily tempdb-bound, you should consider READ_COMMITTED_SNAPSHOT for your current NOLOCK scenarios instead. See Pros and cons of READ_COMMITTED_SNAPSHOT

Community
  • 1
  • 1
Aaron Bertrand
  • 272,866
  • 37
  • 466
  • 490
1

I usually use this to query a relatively busy table I normally use for logging.

SELECT TOP 10 * FROM dbo.MessageLog (NOLOCK) WHERE AppCode = 'DesktopApp' ORDER BY MessageDate DESC

The table's records are mainly written once, and never updated.

YS.
  • 1,808
  • 1
  • 19
  • 33
  • 1
    Yep. I've used it for that kind of thing before too where you know that your reading query is likely to be blocked by concurrent inserts and also know that rolledback inserts are either very unlikely or it doesn't really matter if you do end up reading data that is rolled back. The `READPAST` hint could be used to read only committed data. – Martin Smith Jun 05 '12 at 11:27
1

Depending on the circumstances it can provide faster access if you are willing to accept that some records might be out of date.

For example:

SELECT COUNT(*) FROM mytable (nolock) 

will use less resources and generally be faster on a large table than

SELECT COUNT(*) FROM mytable 
pd40
  • 3,187
  • 3
  • 20
  • 29
  • 2
    In this case it is probably better to query `sys.partitions` which is about as accurate as `SELECT COUNT(*) WITH (NOLOCK)` but doesn't take as long. – Aaron Bertrand Jun 05 '12 at 12:11
1

Not all data is created equal. Quite a lot of data isn't actually critical, and so it is ok, in some situations, to have dirty reads going on. For instance, a batch process could be bulk updating a number of different tables. You may want to read the the number of Read Messages for a user. You'd prefer if the number was out by 1 or 2 rather than the user having to wait seconds, minutes, or hours for the locks to become available.

In other words, you're increasing concurrency when the exact data isn't actually required, and it's ok to have some (potentially) invalid data.

Josh Smeaton
  • 47,939
  • 24
  • 129
  • 164
1

We have a table holding a queue that is only ever inserted to and updated; nothing is ever deleted. The rows have different flags that indicate what is occurring with the associated process. The production system only uses rowlocks, but dozens at a a time are held on various rows, so that different processes can occur at once.

We check how many items are still being processed, to avoid overloading the system. When a user launches a new process, it waits to spawn sub-processes until there are fewer than 10 items currently being processesd in the queue.

We need to prevent taking locks on processes that may need them to update their owns status, and we need to see the status of locked items. We use with (nolock) to prevent waiting to see what is going on in the queue, and to count how many items are not yet marked complete - which is guaranteed by the logic to only occur when the process completes.

David Manheim
  • 2,553
  • 2
  • 27
  • 42
0

we use it on tables that have lots of reads, but few writes. If the connections are just reading data then this is often no danger in doing dirty reads. This prevents blocking on the tables which increases performance.

Check out https://stackoverflow.com/a/1453000/1038940 for some more info about nolock and when it can be good/bad

Community
  • 1
  • 1
Greg
  • 3,442
  • 3
  • 29
  • 50
  • 1
    If there is no danger of reading dirty data then why bother with the hint? It will only in fact be of any use when there is dirty data. SQL Server often won't take `S` locks out anyway in the case you describe. See [The Case of the Missing Shared Locks](http://sqlblog.com/blogs/paul_white/archive/2010/11/01/read-committed-shared-locks-and-rollbacks.aspx) – Martin Smith Jun 05 '12 at 11:30
  • Sorry, probably wasn't quite clear enough. There is a danger of dirty reads because of the few writes, but the connection doing the read is just reading data and we aren't concerned if we get dirty reads or not (things like logging tables, but even things liek employee names that rarely change). – Greg Jun 05 '12 at 23:49
0

Quite easy for us - we fetch non-financial (and usally rarely changed) data (like customer address or goods description or various configuration options) with nolock hint, financial or quantitative data (like prices or balances) with "normal" locking.

Arvo
  • 10,349
  • 1
  • 31
  • 34
0

You are clearly a genius. It should never be used.

NOLOCK is often exploited as a magic way to speed up database reads, but I try to avoid using it whever possible.

The result set can contain rows that have not yet been committed, that are often later rolled back.

An error or Result set can be empty, be missing rows or display the same row multiple times.

This is because other transactions are moving data at the same time you're reading it.

READ COMMITTED adds an additional issue where data is corrupted within a single column where multiple users change the same cell simultaneously.

There are other side-effects too, which result in sacrificing the speed increase you were hoping to gain in the first place.

Now you know, never use it.

WonderWorker
  • 8,539
  • 4
  • 63
  • 74