I am working on a DW project where I need to query a live CRM system. The standard isolation level negatively affects performance, so I am tempted to use NOLOCK / TRANSACTION ISOLATION LEVEL READ UNCOMMITTED. I want to know how many of the selected rows are affected by dirty reads.
3 Answers
Maybe you can do this:
SELECT * FROM T WITH (SNAPSHOT)
EXCEPT
SELECT * FROM T WITH (READCOMMITTED, READPAST)
But this is inherently racy.
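To see what the EXCEPT trick computes, here is a minimal sqlite3 simulation (table names and rows are made up, and this only models the set difference, not real SQL Server locking: `dirty_snapshot` stands in for the snapshot read, `committed_snapshot` for the READCOMMITTED/READPAST read):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dirty_snapshot (id INTEGER, name TEXT)")
conn.execute("CREATE TABLE committed_snapshot (id INTEGER, name TEXT)")

# The snapshot-style read sees one extra row that the committed read skips.
conn.executemany("INSERT INTO dirty_snapshot VALUES (?, ?)",
                 [(1, "alice"), (2, "bob"), (3, "pending")])
conn.executemany("INSERT INTO committed_snapshot VALUES (?, ?)",
                 [(1, "alice"), (2, "bob")])

# EXCEPT: rows visible to the first read but not the second.
suspects = conn.execute(
    "SELECT * FROM dirty_snapshot EXCEPT SELECT * FROM committed_snapshot"
).fetchall()
print(suspects)  # [(3, 'pending')]
```

The race the answer mentions is exactly the gap between the two reads: a row can commit or roll back between them, so the difference is a point-in-time estimate, not a reliable count.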

- Very nice, but why snapshot? Wouldn't read committed be enough? – benjamin moskovits Aug 18 '15 at 18:47
- @benjaminmoskovits Read committed can read x-locked rows. I know that sounds hard to believe, but it is true. RC only guarantees that committed data is read; it makes no guarantees about locking. Also, it doesn't get better than snapshot, so why not use it? – usr Aug 18 '15 at 19:09
- Snapshot is expensive, resource-wise. – benjamin moskovits Aug 18 '15 at 19:12
- @benjaminmoskovits Which resources are you speaking of? Snapshot costs are roughly proportional to DML, so they may amount to near zero percent. Reads are super cheap, especially when locks are avoided thanks to SI. – usr Aug 18 '15 at 19:13
Why do you need to know that?
You use TRANSACTION ISOLATION LEVEL READ UNCOMMITTED just to indicate that a SELECT statement won't wait for any update/insert/delete transactions on the table/pages/rows to finish, and will grab even dirty records. And you do it to increase performance. Trying to find out which records were dirty is like punching a blender with your face: it hurts and gives you nothing but pain. The records were dirty at some point, and now they aren't. Or are they still dirty? Who knows...
Update:
Now about data quality. Imagine you read a dirty record with a query like:
SELECT *
FROM dbo.MyTable
WITH (NOLOCK)
and, for example, got a record with id = 1 and name = 'someValue'. Then you want to update the name, setting it to 'anotherValue', so you run the following query:
UPDATE dbo.MyTable
SET
Name = 'anotherValue'
WHERE id = 1
So if the record exists, you'll get the actual value there; if it was deleted (even under a dirty read: deleted but not yet committed), nothing terrible happens, the query simply won't affect any rows. Is that a problem? Of course not, because in the time between your read and your update things could change a zillion times. Just check @@ROWCOUNT
to make sure the query did what it had to, and warn the user about the result.
Anyway, it depends on the situation and the importance of the data. If the data MUST be current, don't use dirty reads.
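The write-back-and-check pattern above can be sketched with sqlite3 as a stand-in for SQL Server (table and values follow the example; `cursor.rowcount` plays the role of @@ROWCOUNT):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (id INTEGER PRIMARY KEY, Name TEXT)")
conn.execute("INSERT INTO MyTable VALUES (1, 'someValue')")

# Simulate another session deleting the row between our dirty read
# and our write-back (the dirty row effectively never existed for us).
conn.execute("DELETE FROM MyTable WHERE id = 1")

# Like @@ROWCOUNT in T-SQL, cursor.rowcount reports how many rows
# the UPDATE actually affected.
cur = conn.execute("UPDATE MyTable SET Name = 'anotherValue' WHERE id = 1")
if cur.rowcount == 0:
    print("no rows updated; warn the user")  # prints: the row is gone
```

Nothing breaks when the target row has vanished; the UPDATE simply hits zero rows, which is exactly the signal to surface to the user.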

- Data quality is the main reason to identify uncommitted rows. I do not really want to bring records into the DW that were actually never committed, unless I have no choice. I want to do this research and present it to the business so that we can make an optimal decision. – BI Dude Mar 07 '14 at 09:17
The standard isolation level negatively influences performance
So why don't you address that? You know dirty reads are inconsistent reads, so you shouldn't use them. The obvious answer is to use snapshot isolation. Read Implementing Snapshot or Read Committed Snapshot Isolation in SQL Server: A Guide.
But the problem actually goes deeper. Why do you encounter blocking at all? Why are reads blocked by writes? A DW workload should not be let loose on the operational transactional data; that is what we have ETL and OLAP products for. Consider cubes, columnstores, PowerPivot, all the goodness that allows for incredibly fast DW and analysis. Don't burden the operational business database with your analytical end-to-end scans, or you'll have nothing but problems.

- Thanks for the link, and I appreciate what you are saying. The main reason for the blocking is the filtered views of MS CRM Dynamics. ETL products still need to grab operational data; that is the first step. Cubes/PowerPivot/Reporting come after that. I'm a bit confused about what you mean by "a DW workload should not be let loose on the operational transactional data". – BI Dude Mar 07 '14 at 10:09
- A DW workload contains queries that scan ranges ('daily sales', 'last week presence', 'NW region costs'). The operational OLTP workload is very narrow, one row or a small set of rows at a time ('insert one sale', 'update employee with ID 2', 'delete *this* pending order'). The OLTP writes do not interact with one another because the business processes they represent do not overlap (no two operators try to insert the *same* invoice data). But the DW reads interact with the OLTP writes, because the OLTP writes fall into the DW queries' ranges. – Remus Rusanu Mar 07 '14 at 12:25
- This is why DW workloads interact so poorly with OLTP: they tend to block one another because the DW 'snoops' at everything, due to range scans (or often even full scans). Even when the tables have the necessary indexes for the DW scans, they still tend to block (e.g. a report of 'today's sales' may block the insert of a new sale, an extreme example). SNAPSHOT helps a great deal, but has a cost. – Remus Rusanu Mar 07 '14 at 12:27
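The range-versus-point overlap described in these comments can be sketched in a toy Python illustration (the dates and the "today's sales" range are made up; a real engine would surface this as key-range or page lock conflicts):

```python
from datetime import date

# Sale dates touched by concurrent OLTP point writes (three distinct invoices).
oltp_writes = [date(2014, 3, 7), date(2014, 3, 7), date(2014, 3, 6)]

# A DW report scans a range: "today's sales" for 2014-03-07.
dw_lo, dw_hi = date(2014, 3, 7), date(2014, 3, 7)

# Each point write whose key falls inside the DW scan range is a potential
# read/write conflict, even though the writes are distinct rows that would
# never conflict with one another.
conflicts = [k for k in oltp_writes if dw_lo <= k <= dw_hi]
print(len(conflicts))  # two of the three inserts land inside the report's range
```

The narrow OLTP writes stay disjoint from each other, but the wide DW predicate covers them, which is exactly why the report and the inserts end up blocking each other.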