0

I have a table which I'm updating, the update itself takes 8 seconds to complete. When a user comes in, I need to update the table and present them with a sum from the table:

Query

SELECT Jaccard/(SELECT SUM(Jaccard) FROM PreProcessed)
FROM PreProcessed
WHERE MinX = 25 AND MinY = 25
AND MaxX = 26 AND MaxY = 26

The problem is that multiple users can come in at the same time and the X lock of the UPDATE statement means that I cannot read from the table for a prolonged period of time.

I was wondering what type of LOCK I should be using to allow for a query to be able to SELECT from the previously committed data whilst the update is happening? I initially thought of using READCOMMITTED, but it doesn't appear to do what I want. I want the data before the update, not the partially committed data.

Stored Procedure

set statistics time ON

DECLARE @MinX INT = 0;
DECLARE @MinY INT = 0;
DECLARE @MaxX INT = 50;
DECLARE @MaxY INT = 50;

DECLARE @Lambda DECIMAL(10, 5) = 0.5;
DECLARE @ReverseLambda DECIMAL(10, 5) = 1 - @Lambda;

DECLARE @Area INT = (@MaxX - @MinX) * (@MaxY - @MinY);

UPDATE PreProcessed
SET Jaccard = (@ReverseLambda * Jaccard) + (@Lambda * dbo.fn_ComputeJaccard(@MinX, @MinY, @MaxX, @MaxY, @Area, MinX, MinY, MaxX, MaxY))
FROM PreProcessed

set statistics time off

What can I do to allow reads to happen during my update, if anything?

EDIT

select CAST(1 AS DECIMAL(38, 28))/CAST(1625625 AS DECIMAL(38, 28))

select CAST(1 AS DECIMAL(20, 10))/CAST(1625625 AS DECIMAL(20, 10))
Faraday
  • 2,904
  • 3
  • 23
  • 46
  • If you're OK with a dirty read, you could try setting transaction isolation level to read uncommitted: http://stackoverflow.com/questions/2471055/why-use-a-read-uncommitted-isolation-level – Dan Jan 16 '14 at 16:27
  • @Dan - Dirty read is a no-no. I really need it to read the previously committed data (before this whole update) it at all possible. I even thought about using temp tables, but the merge takes 12 seconds... Which again locks the table... – Faraday Jan 16 '14 at 16:38

1 Answers1

1

This is one of the times you need to denormalise. Create a table

create table PreProcessedTotal (
   JaccardTotal decimal(18, 4) not null
)

(substitute the appropriate data type). You need to add three triggers to table PreProcessed:

  • An Insert trigger to add the value of Jaccard in the new row
  • An Update, to add the Inserted value and substract the DELETED
  • A Delete trigger to subtract the deleted value

You can then use:

select Jaccard / JaccardTotal
from Preprocessed with (nolock)
cross join PreProcessedTotal with (nolock)

The with (nolock) may not be needed. You'll also need to populate the PreProcessedTotal table with the current total when you put it live.

simon at rcl
  • 7,326
  • 1
  • 17
  • 24
  • Wow... Just wow... Although I haven't implemented this, it's a great solution! At first I was concerned about dirty reads, but clearly that's not an issue. I'm glad I posted the question, I almost compromised with a `READPAST` lock! :) – Faraday Jan 16 '14 at 17:07
  • I am assuming that most of the 8 seconds is taken in doing the table scan to get the SUM. You should also check the efficiency of the query without the sum on the Min/Max values and add indexes on those columns if it's not efficient. – simon at rcl Jan 16 '14 at 17:10
  • Quick question... Why when I'm doing this: `Jaccard/JaccardTotal` does it return zero, but when I use `SUM` it returns the right value? JaccardTotal is a `DECIMAL(38,28)`... – Faraday Jan 16 '14 at 17:26
  • I don't know; as long as the min/max values are the same using SUM or not shouldn't make any difference. Are you sure the values were the same? The only way it can return 0 is if Jaccard is 0; if JaccardTotal is zero it will error. – simon at rcl Jan 16 '14 at 17:36
  • PS You can also select Jaccard and Jaccard Total separately in the same query to see just what its getting as well as the calced figure. – simon at rcl Jan 16 '14 at 17:39
  • Try the updated selects in the question, you'll see what I mean... :) – Faraday Jan 16 '14 at 17:39
  • Posted my question to another question: http://stackoverflow.com/questions/21169231/divide-decimal-by-decimal-and-get-zero – Faraday Jan 16 '14 at 17:52
  • Yep I see. Nothing you can do about that. decimal(38, 28) is well into Floating Point territory and FP numbers are not accurate. Decimals are 'managed' floating point numbers and (20, 10) is just about within the manageable range, while (38, 24) definitely isn't! That's just how FP numbers work.... – simon at rcl Jan 16 '14 at 17:54
  • Have a look ar http://stackoverflow.com/questions/21162624/indexing-float-values-in-python - the first answer gives a good overview. – simon at rcl Jan 16 '14 at 17:56