269

In plain English, what are the disadvantages and advantages of using

SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED

in a query for .NET applications and reporting services applications?

Michael Myers
  • 188,989
  • 46
  • 291
  • 292
Kip Real
  • 3,319
  • 4
  • 21
  • 28

10 Answers

246

This isolation level allows dirty reads. One transaction may see uncommitted changes made by some other transaction.

To maintain the highest level of isolation, a DBMS usually acquires locks on data, which may result in a loss of concurrency and a high locking overhead. This isolation level relaxes this property.

You may want to check out the Wikipedia article on READ UNCOMMITTED for a few examples and further reading.
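For a concrete (if contrived) illustration, here is a minimal sketch of a dirty read; the table and values are made up, but the pattern is the standard two-session demonstration:

```sql
-- Session 1: update a row inside a transaction, but don't commit yet.
BEGIN TRANSACTION;
UPDATE dbo.Accounts SET Balance = Balance - 100 WHERE AccountId = 1;
-- (leave the transaction open)

-- Session 2: under READ UNCOMMITTED this SELECT does not block and returns
-- the not-yet-committed balance (a dirty read).
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
SELECT AccountId, Balance FROM dbo.Accounts WHERE AccountId = 1;

-- Session 1: if the transaction now rolls back, session 2 has seen a value
-- that never existed in any committed state of the database.
ROLLBACK TRANSACTION;
```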


You may also be interested in checking out Jeff Atwood's blog article on how he and his team tackled a deadlock issue in the early days of Stack Overflow. According to Jeff:

But is nolock dangerous? Could you end up reading invalid data with read uncommitted on? Yes, in theory. You'll find no shortage of database architecture astronauts who start dropping ACID science on you and all but pull the building fire alarm when you tell them you want to try nolock. It's true: the theory is scary. But here's what I think: "In theory there is no difference between theory and practice. In practice there is."

I would never recommend using nolock as a general "good for what ails you" snake oil fix for any database deadlocking problems you may have. You should try to diagnose the source of the problem first.

But in practice adding nolock to queries that you absolutely know are simple, straightforward read-only affairs never seems to lead to problems... As long as you know what you're doing.

One alternative to the READ UNCOMMITTED level that you may want to consider is the READ COMMITTED SNAPSHOT. Quoting Jeff again:

Snapshots rely on an entirely new data change tracking method ... more than just a slight logical change, it requires the server to handle the data physically differently. Once this new data change tracking method is enabled, it creates a copy, or snapshot of every data change. By reading these snapshots rather than live data at times of contention, Shared Locks are no longer needed on reads, and overall database performance may increase.
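For reference, here is roughly what switching a database to read committed snapshot looks like; the database and table names below are placeholders:

```sql
-- Enable read committed snapshot at the database level. This needs exclusive
-- access to the database; WITH ROLLBACK IMMEDIATE kicks out other connections.
ALTER DATABASE MyDatabase SET READ_COMMITTED_SNAPSHOT ON WITH ROLLBACK IMMEDIATE;

-- Afterwards, ordinary READ COMMITTED queries read the last committed version
-- of each row from the version store instead of waiting on writers' locks.
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
SELECT OrderId, Status FROM dbo.Orders WHERE CustomerId = 42;
```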

Jon Adams
  • 24,464
  • 18
  • 82
  • 120
Daniel Vassallo
  • 337,827
  • 72
  • 505
  • 443
  • 24
    The author seems to imply that read uncommitted / no lock will return whatever data was last committed. My understanding is read uncommitted will return whatever value was last set even from uncommitted transactions. If so, the result would not be retrieving data "a few seconds out of date". It would (or at least could if the transaction that wrote the data you read gets rolled back) be retrieving data that doesn't exist or was never committed. Am I mistaken? – xr280xr May 11 '11 at 22:25
  • 3
    @xr280xr I don't think that's what Jeff was suggesting. Perhaps you formed that impression in the latter part of his article, where he refers to "read committed snapshot mode" as applied at the database level, e.g. ALTER DATABASE myDatabase SET READ_COMMITTED_SNAPSHOT ON. This mode affects the behavior of the "SET TRANSACTION ISOLATION LEVEL READ COMMITTED" statement: instead of a read being blocked by someone else's transaction, it will read from a snapshot of the data taken at the start of your statement. – Dan Nolan Aug 19 '11 at 07:50
  • 5
    Great answer. BTW, Oracle has had "snapshot" behavior by default for as long as I have known it, probably decades before SQL Server introduced it. I was quite disappointed when starting with SQL Server and I found out that all concurrency problems were solved using "primitive" locking mechanisms. Never saw "Read uncommitted" in Oracle. And the practitioners are as happy as the astronauts. – Stefan Steinegger Oct 10 '11 at 06:11
  • By "dirty" though don't they mean that a join between 2 tables might be wrong, not a single table being corrupt/invalid? This suggests it's purely a timing issue - that it could read table1's FK while table2's row is being deleted, and then hit table2 on a join and give you table1 rows that refer to table2 rows that are missing, but not table1 rows with a row half-updated or a value half-written: http://blogs.msdn.com/b/sqlcat/archive/2011/03/03/comparing-different-results-with-rcsi-amp-read-committed.aspx – Chris Moschini Mar 29 '13 at 20:53
  • 2
    @xr280xr that is correct, which is why you should not use read uncommitted for anything except simple, read-only scenarios. It's possible to read out a row that may never commit to the database. For large queries where such a row would appear at the tail of the result set AND where a user only ever sees the head of the result set, this artifact doesn't matter, and you can reduce lock contention DB-wide by not acquiring additional shared locks which, in a given case, are unnecessary. – Shaun Wilson Nov 19 '13 at 20:42
  • 1
    @ChrisMoschini a proper DBMS should still maintain data integrity (such as FK references) regardless of locking semantics such that a single atomic read (in the case of a SQL JOIN operation) should provide an accurate view of the data, regardless of transaction isolation level. If there's a DBMS that would perform a join where a key constraint could not be satisfied then there's a DBMS that is dysfunctional. For multi-select scenarios (not a join, but in fact two separate queries where the second is based on a FK value from the first) it's possible to select phantom rows, yes. – Shaun Wilson Nov 19 '13 at 20:47
  • 21
    [`READ UNCOMMITTED` can also cause you to read rows twice, or miss entire rows](http://blogs.msdn.com/b/sqlcat/archive/2007/02/01/previously-committed-rows-might-be-missed-if-nolock-hint-is-used.aspx). If a page-split happens while you're reading, then you can miss entire chunks of data. `WITH(NOLOCK)` should only be used if the accuracy of the results isn't important – Ian Boyd Feb 03 '14 at 18:36
  • 12
    @DanielNolan, Recommending that article is dangerous because [Jeff didn't know](http://samsaffron.com/archive/2008/08/27/Deadlocked+) what [he was doing](http://goo.gl/ACo07C). Read-uncommitted only makes sense for reading data that will never be modified. Trying to use it to read tables that would be written to means that you **will in practice** read something that gets rolled back. It's not just that you are reading data that's a few seconds old, but you... – Pacerier Dec 19 '14 at 04:27
  • 8
    ...are reading data that **never** even gets committed. That's the very definition of corrupted reads. And if you are going to **write based on the results** of those uncommitted reads, you **will in practice** be writing corrupted data. Also, the article stated that "MySQL, which grew up on web apps, is much less pessimistic out of the box than SQL Server". Not true: SQL Server works at the read-committed level by default, while MySQL works at repeatable-read by default, [five levels](http://www.infoq.com/articles/eight-isolation-levels) away from read-uncommitted. – Pacerier Dec 19 '14 at 04:28
  • Can't upvote Ian Boyd's comment enough. READ UNCOMMITTED might cause rows that are already COMMITTED to be missing from the result. It might also count the same row twice in some aggregate operation (sum, min, max...), giving results that are just plain wrong. – skyde Aug 08 '16 at 18:53
65

My favorite use case for read uncommitted is to debug something that is happening inside a transaction.

Start your software under a debugger; while you are stepping through the lines of code, it opens a transaction and modifies your database. While the code is stopped, you can open a query analyzer, set the read uncommitted isolation level, and run queries to see what is going on.

You can also use it to see whether long-running procedures are stuck or are correctly updating your database, using a query with COUNT(*).

It is great if your company loves to make overly complex stored procedures.
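A rough sketch of what that debugging session might look like (the table name here is hypothetical):

```sql
-- In a separate query window, while the application is paused in the debugger
-- with its transaction still open:
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;

-- Peek at the rows the paused code has written so far (dirty read, no blocking).
SELECT TOP (50) * FROM dbo.ImportStaging ORDER BY CreatedAt DESC;

-- Or just check how far a long-running procedure has gotten.
SELECT COUNT(*) FROM dbo.ImportStaging;
```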

neves
  • 33,186
  • 27
  • 159
  • 192
51

This can be useful for seeing the progress of long insert queries, making rough estimates (like a COUNT(*) or an approximate SUM), etc.

In other words, the results the dirty read queries return are fine as long as you treat them as estimates and don't make any critical decisions based upon them.

Quassnoi
  • 413,100
  • 91
  • 616
  • 614
31

The advantage is that it can be faster in some situations. The disadvantage is the result can be wrong (data which hasn't been committed yet could be returned) and there is no guarantee that the result is repeatable.

If you care about accuracy, don't use this.

More information is on MSDN:

Implements dirty read, or isolation level 0 locking, which means that no shared locks are issued and no exclusive locks are honored. When this option is set, it is possible to read uncommitted or dirty data; values in the data can be changed and rows can appear or disappear in the data set before the end of the transaction. This option has the same effect as setting NOLOCK on all tables in all SELECT statements in a transaction. This is the least restrictive of the four isolation levels.
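In other words, the following two forms should behave the same for the reads involved; the tables are placeholders, purely for illustration:

```sql
-- Setting the isolation level for the session...
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
SELECT o.OrderId, c.Name
FROM dbo.Orders AS o
JOIN dbo.Customers AS c ON c.CustomerId = o.CustomerId;

-- ...has the same effect as hinting every table with NOLOCK under the
-- default READ COMMITTED level.
SELECT o.OrderId, c.Name
FROM dbo.Orders AS o WITH (NOLOCK)
JOIN dbo.Customers AS c WITH (NOLOCK) ON c.CustomerId = o.CustomerId;
```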

Mark Byers
  • 811,555
  • 193
  • 1,581
  • 1,452
25

When is it ok to use READ UNCOMMITTED?

Rule of thumb

Good: Big aggregate reports showing constantly changing totals.

Risky: Nearly everything else.

The good news is that the majority of read-only reports fall in that Good category.

More detail...

Ok to use it:

  • Nearly all user-facing aggregate reports for current, non-static data, e.g. year-to-date sales. It risks a margin of error (maybe < 0.1%) which is much lower than other uncertainty factors such as input errors or just the randomness of when exactly data gets recorded minute to minute. (See the sketch at the end of this answer.)

That covers probably the majority of what a Business Intelligence department would do in, say, SSRS. The exception, of course, is anything with $ signs in front of it. Many people account for money with much more zeal than is applied to the related core metrics required to service the customer and generate that money. (I blame accountants.)

When risky

  • Any report that goes down to the detail level. If that detail is required, it usually implies that every row will be relevant to a decision. In fact, if you can't pull a small subset without blocking, it might be for the good reason that it's currently being edited.

  • Historical data. It rarely makes a practical difference, but whereas users understand that constantly changing data can't be perfect, they don't feel the same about static data. Dirty reads won't hurt here, but double reads occasionally can. Seeing as you shouldn't have blocks on static data anyway, why risk it?

  • Nearly anything that feeds an application which also has write capabilities.

When even the OK scenario is not OK.

  • Are any applications or update processes making use of big single transactions? Ones which remove then re-insert a lot of records you're reporting on? In that case you really can't use NOLOCK on those tables for anything.
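To make the "Good" aggregate-report case above concrete, here is a sketch of the kind of query this answer has in mind; the schema is purely illustrative:

```sql
-- Year-to-date sales by month: a constantly changing aggregate where a tiny
-- margin of error is acceptable.
SELECT
    DATEPART(month, s.SaleDate) AS SaleMonth,
    SUM(s.Amount)               AS TotalSales,
    COUNT(*)                    AS OrderCount
FROM dbo.Sales AS s WITH (NOLOCK)
WHERE s.SaleDate >= DATEFROMPARTS(YEAR(GETDATE()), 1, 1)
GROUP BY DATEPART(month, s.SaleDate)
ORDER BY SaleMonth;
```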
Adamantish
  • 1,888
  • 2
  • 20
  • 23
  • Good point about reports. Actually, the first idea that came to my mind was whether I should use `read uncommitted` for web applications where the user sees some UI grid in which data accuracy is not so important. The user just wants a quick overview of what records might be there, maybe with some paging, sorting & filtering. Only when the user clicks the Edit button do I try to read the most current record with a stricter isolation level. Shouldn't such an approach be better in terms of performance? – JustAMartin Dec 14 '15 at 20:09
  • Yes, I think that's reasonable. Remember that the more significant issue is making sure the data hasn't been changed by someone else between the time of hitting the edit button and the time of submit. You could handle that by starting a transaction and fetching the data like `select item from things with (UPDLOCK)`. Put a quick timeout in there so that if it can't acquire the lock quickly, it tells the user the record is being edited. That will keep you safe not just from users but from developers. The only trouble here is you have to start thinking about timeouts and how to handle them in the UI. – Adamantish Dec 16 '15 at 00:14
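A rough sketch of the `UPDLOCK`-plus-timeout idea from the comment above; the table, key, and timeout value are placeholders:

```sql
DECLARE @ThingId int = 42;  -- placeholder key of the record being edited

SET LOCK_TIMEOUT 2000;      -- give up after 2 seconds instead of blocking forever

BEGIN TRANSACTION;

-- UPDLOCK takes an update lock on the row, so a second editor attempting the
-- same read will wait and then fail with lock timeout error 1222, which the
-- application can turn into a "record is being edited" message.
SELECT Item
FROM dbo.Things WITH (UPDLOCK, ROWLOCK)
WHERE ThingId = @ThingId;

-- ... show the edit UI and apply the user's change while the transaction
-- stays open, then:
COMMIT TRANSACTION;
```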
8

Use READ UNCOMMITTED in situations where the source is highly unlikely to change:

  • When reading historical data, e.g. deployment logs from two days ago.
  • Again, when reading metadata, e.g. in a metadata-based application.

Don't use READ UNCOMMITTED when you know the source may change during the fetch operation.

neo
  • 1,054
  • 1
  • 10
  • 19
  • 1
    I feel the reverse applies. Firstly, static data should read fine with no blocks. If it **does** block, then you've now discovered there's an important hanging-transaction problem to fix. Also, users will expect this to match, down to the last decimal place, whatever they printed out for last year's annual report. They don't generally expect the same of reports they know are in constant flux. This doesn't go for detailed, extremely time-sensitive or financial reporting, but if 1 input error in 1000 is tolerable then so is `READ UNCOMMITTED`. – Adamantish Nov 19 '14 at 17:09
  • TLDR: If data won't change you don't need READ UNCOMMITTED because there are no blocks anyway. If you're wrong and it does change then it's a good thing you blocked users from getting dirtier data than expected. – Adamantish Sep 18 '15 at 12:01
  • Yes, I tend to agree with @Adamantish here - you can benefit from `READ UNCOMMITTED` most in situations where your data is being actively used and you want to reduce the load on the server to avoid possible deadlocks and transaction rollbacks just because some users were carelessly abusing the "Refresh" button on a web page with a datagrid. Users who view a bunch of records at the same time usually do not care much if the data is a bit outdated or partially updated. Only when a user is about to edit a record might you want to give him/her the most accurate data. – JustAMartin Dec 14 '15 at 20:16
8

Regarding reporting, we use it on all of our reporting queries to prevent a query from bogging down databases. We can do that because we're pulling historical data, not up-to-the-microsecond data.

Hugh Seagraves
  • 594
  • 1
  • 8
  • 14
2

This will give you dirty reads, and show you transactions that are not committed yet. That is the most obvious answer. I don't think it's a good idea to use this just to speed up your reads. There are other ways of doing that if you use a good database design.

It's also interesting to note what's not happening. READ UNCOMMITTED does not only ignore other transactions' locks; it also takes no shared locks of its own.

Consider that you are generating a large report, or you are migrating data out of your database using a large and possibly complex SELECT statement. This will cause shared locks that may be escalated to a shared table lock for the duration of your transaction. Other transactions may read from the table, but updates are impossible. This may be a bad idea if it's a production database, since production may stop completely.

If you are using READ UNCOMMITTED you will not take a shared lock on the table. You may or may not get the results from some new transactions, depending on where in the table the data was inserted and how far your SELECT has read. You may also get the same data twice if, for example, a page split occurs (the data will be copied to another location in the data file).

So, if it's very important for you that data can be inserted while doing your SELECT, READ UNCOMMITTED may make sense. You have to accept that your report may contain some errors, but if it's based on millions of rows and only a few of them are updated while selecting the result, this may be "good enough". Your transaction may also fail altogether, since the uniqueness of a row may not be guaranteed.

A better way altogether may be to use SNAPSHOT ISOLATION LEVEL, but your applications may need some adjustments to use it. One example is if your application takes an exclusive lock on a row to prevent others from reading it and goes into edit mode in the UI. SNAPSHOT ISOLATION LEVEL also comes with a considerable performance penalty (especially on disk), but you may overcome that by throwing hardware at the problem. :)
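As a sketch, enabling and using snapshot isolation might look like this (database and table names are placeholders):

```sql
-- Enable snapshot isolation at the database level (row versions are kept in
-- tempdb, which is where much of the disk/performance cost comes from).
ALTER DATABASE MyDatabase SET ALLOW_SNAPSHOT_ISOLATION ON;

-- A reporting session can then read a transactionally consistent snapshot
-- without taking shared locks or blocking writers.
SET TRANSACTION ISOLATION LEVEL SNAPSHOT;
BEGIN TRANSACTION;
SELECT COUNT(*) FROM dbo.Orders;
SELECT SUM(Amount) FROM dbo.OrderLines;
COMMIT TRANSACTION;
```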

You may also consider restoring a backup of the database to use for reporting or loading data into a data warehouse.

Olle Johansson
  • 508
  • 3
  • 10
0

It can be used for a simple table, for example an insert-only audit table, where there are no updates to existing rows and no FKs to other tables. The insert is a simple insert, which has little or no chance of rollback.
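A sketch of that scenario; the audit table here is hypothetical:

```sql
-- Append-only audit table: rows are only ever inserted, never updated, so a
-- dirty read can at worst include a row that still gets rolled back; it will
-- not see a half-updated row.
SELECT TOP (100) AuditId, EventType, EventTime, UserName
FROM dbo.AuditLog WITH (NOLOCK)
ORDER BY AuditId DESC;
```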

Sofian
  • 11
-6

I always use READ UNCOMMITTED now. It's fast, with the fewest issues. When using other isolation levels you will almost always come across some blocking issues.

As long as you use auto-increment fields and pay a little more attention to inserts, then you're fine, and you can say goodbye to blocking issues.

You can make errors with READ UNCOMMITTED, but to be honest, it is very easy to make sure your inserts are foolproof. Inserts/updates which use the results from a SELECT are the only thing you need to watch out for. (Use READ COMMITTED here, or ensure that dirty reads aren't going to cause a problem.)

So go with dirty reads (especially for big reports); your software will run smoother...

Clive
  • 49
  • 2
  • 7
    This is very inaccurate and only touches the surface of problems that can occur with nolock. Pages can split, joins can fail, you can read non-existent data or duplicate data. There is no way to make its use foolproof: you cannot trust the accuracy of anything under this isolation level. READ COMMITTED SNAPSHOT is a less dangerous "fake panacea" – Mark Sowul Jan 24 '12 at 22:17
  • @MarkSowul The downvoting of this answer seems unfair to me. @Clive was clear that he'd switch to `Committed` for inserts and updates. As for other problems, he also demonstrated awareness of page-split issues in mentioning the use of an auto-incrementing key. I agree with him that nearly all live reporting made to be read by just a human can tolerate slight discrepancies in the final decimal place. I'd agree it's a different story for detailed listings or data destined to be machine-read and transformed, and so does Clive. – Adamantish Nov 19 '14 at 17:21
  • 2
    This comment also demonstrates a lack of full understanding of the possible problems that come with nolock. "Slight discrepancies in the final decimal place" hardly covers it. Even touching on the "just use read committed for inserts/updates" is wrong as general advice (what if it's "insert a record if it doesn't exist"?). In any event, "[read uncommitted] is fast with the least issues" is categorically wrong. – Mark Sowul Nov 20 '14 at 14:43
  • For the record, I agree with your answer, Adamantish: roughly accurate aggregations and little else. – Mark Sowul Nov 20 '14 at 14:45