
Situation:

1) There is a big TABLE1 (9 GB data, 20 GB index space, 12M rows)
2) There are several UPDATE and UPDATE/SELECT statements on TABLE1 which are run one by one
3) Each UPDATE statement updates different columns
4) None of them uses a previously updated column to calculate a newly updated column
5) It takes a while to complete them all

Issue:

I want to run those UPDATEs at the same time, but I'm concerned about deadlocks. How can I avoid them? Will SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED help?

The UPDATEs look like this:

update TABLE1 set col1 = subs.col2
from (select ID, col2 from TABLE2) subs
where TABLE1.ID = subs.ID

update TABLE1 set col10 = col2+col3+col4

update TABLE1 set col100 = col2 + subs.col4
from (
    select 
        b.ID, a.col4
    from 
        TABLE3 a 
        join TABLE1 b on b.ID2 = a.ID2
) subs
where TABLE1.ID = subs.ID

update TABLE1 set col1000 = col2+col3+col4
from TABLE1
join TABLE4 on TABLE4.date = TABLE1.date
join TABLE5 on TABLE5.ID3 = TABLE1.ID 
barthoos
  • Will these updates run sequentially in one session, or will several sessions execute them at the same time? I'm asking because you mentioned an ETL process, and ETL processes generally run in a single thread. – Esat Erkec Nov 02 '19 at 12:06

3 Answers


Dirty reads with READ UNCOMMITTED might work if the same columns are not updated and not used in other clauses, but I'm afraid this is a fragile solution.
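For reference, a minimal sketch of that dirty-read variant, applied to the first UPDATE from the question (note that the isolation level only affects the reads; the UPDATE itself still takes exclusive locks on the rows it modifies):

SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;

-- Run in each concurrent session; only the read side is affected.
UPDATE TABLE1
SET    col1 = subs.col2
FROM   (SELECT ID, col2 FROM TABLE2) subs
WHERE  TABLE1.ID = subs.ID;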

For a more consistent solution you can mix ROWLOCK/UPDLOCK/NOLOCK hints depending on the operation. For example:

UPDATE
    TABLE1 WITH (ROWLOCK)
SET
    col1 = TABLE2.col2
FROM
    TABLE1 WITH (ROWLOCK, UPDLOCK)
    INNER JOIN TABLE2 WITH (NOLOCK) ON (TABLE1.ID = TABLE2.ID)

If your statements mostly update different rows, then ROWLOCK can be omitted.

In rare cases lock escalation might happen, but it can be limited by:

ALTER TABLE TABLE1 SET (LOCK_ESCALATION = DISABLE)

BTW, what is the purpose of your solution? I don't think you'll gain much performance, and small partial updates can be handled faster than large updates run in parallel.

Stan
  • I appreciate the effort, but your solution will still end up with deadlocks since there are full table scans. – barthoos Jun 09 '14 at 12:25
  • If you use UPDLOCK in the selection clauses, then the scan will acquire update->exclusive locks. Two threads can't share them, so I don't think you'll face deadlocks. I have updated the answer with more details. – Stan Jun 09 '14 at 21:05
  • About the purpose: it's a data warehouse database, and this particular table I'm updating is the main sales table. For EACH sale some values are calculated (like additional fees/costs), so basically each of those several updates I run touches most of the rows. There are no "small updates" there. I have a very strong server with lots of RAM and CPU cores, and running queries in parallel gives me a boost in performance. BTW: I'm even thinking about N temp tables and one and only one huge update at the end of the ETL process. – barthoos Jun 10 '14 at 09:15
  • Temp tables might help: SELECT INTO is one of the least-logged operations, while UPDATE is fully logged (especially with updates running in parallel, because when changes miss the log cache the threads write the same records into the transaction log, resulting in many more I/O operations). Good practice for large updates is also to split them into smaller ones, so if you have a unique identifier/timestamp or any other ranged field, it's better to split the 12M records into blocks of 10K-100K and update each block separately (a sketch of this follows below). – Stan Jun 10 '14 at 10:04
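A minimal sketch of the block-wise update suggested in the last comment, assuming ID is a dense integer key; the 50K batch size and the col10 expression are illustrative only:

-- Sketch: process TABLE1 in ID ranges of ~50K rows instead of one 12M-row update.
DECLARE @BatchSize int = 50000;
DECLARE @From int, @MaxID int;

SELECT @From = MIN(ID), @MaxID = MAX(ID) FROM TABLE1;

WHILE @From <= @MaxID
BEGIN
    UPDATE TABLE1
    SET    col10 = col2 + col3 + col4
    WHERE  ID BETWEEN @From AND @From + @BatchSize - 1;

    SET @From = @From + @BatchSize;
END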

(1) Avoid sub-queries while updating. Multiple sub-queries can quickly lead to lock escalation and cause deadlocks (a join-based rewrite is sketched after this list).

(2) Check out the following discussion: TABLOCK vs TABLOCKX.

(3) For current blocking and locking, check out the discussion: How to find out what table a page lock belongs to.
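As an illustration of point (1), the first UPDATE from the question can be written as a direct join instead of a derived table (a sketch; the table and column names are taken from the question):

UPDATE TABLE1
SET    col1 = TABLE2.col2
FROM   TABLE1
       INNER JOIN TABLE2 ON TABLE2.ID = TABLE1.ID;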

Bids

Another strategy: create a temp table holding the IDs of the rows you want to update, along with each column's new value.

CREATE TABLE #tmp (
  RowID          int,
  NewCol1Value   ...,
  NewCol2Value   ...,
  NewCol3Value   ...
)

-- Insert into the tmp table
...

UPDATE Table1
  SET  Col1 = ISNULL(NewCol1Value, Col1),
       Col2 = ISNULL(NewCol2Value, Col2),
       ...
  FROM Table1 INNER JOIN #tmp ON Table1.RowID = #tmp.RowID
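
A sketch of how #tmp might be populated in a single pass; the join and the expressions below are adapted from the question, and the column list assumes the NewCol1Value/NewCol2Value/NewCol3Value definition above:

-- Sketch only: compute all new values once, then apply them in one UPDATE.
INSERT INTO #tmp (RowID, NewCol1Value, NewCol2Value, NewCol3Value)
SELECT  t1.ID,
        t2.col2,                        -- new value for the first column
        t1.col2 + t1.col3 + t1.col4,    -- new value for the second column
        NULL                            -- left NULL here; filled by another calculation
FROM    TABLE1 t1
        LEFT JOIN TABLE2 t2 ON t2.ID = t1.ID;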
Code Different