14

I am fairly new to locks and hints.

I have a table with very frequent SELECT and INSERT operations. The table has 11 million records.

I have added a new column to it and I need to copy over the data from an existing column in the same table to the new column.

I am planning to use the ROWLOCK hint to avoid escalating to table-level locks, which would block all other operations on the table. For example:

UPDATE 
    SomeTable WITH (ROWLOCK)
SET
    NewColumn = OldColumn

Questions:

  1. Would a NOLOCK hint work instead of ROWLOCK? Note that once records are inserted into the table, the value of OldColumn never changes, so NOLOCK would not cause dirty reads.
  2. Does NOLOCK even make sense in this case, given that SQL Server would have to acquire update locks for the UPDATE anyway?
  3. Is there a better way of achieving this?

I know hints are to be avoided and SQL Server usually makes smarter choices, but I don't want to get the table locked out during this update.

HappyTown
  • 6,036
  • 8
  • 38
  • 51
  • The statement as you have it will update all records in the table within a single transaction so no matter how you cut it, you will be locking all records within the transaction. – MikeS Jan 24 '17 at 18:21
  • I once saw Kendra Little do a nice trick with update. The update would not go parallel, but to speed things up she used select in cte to achieve parallel read. – pacreely Jan 28 '17 at 08:31

4 Answers

17

Try to update in batches.

DECLARE @Batch INT = 1000
DECLARE @Rowcount INT = @Batch


WHILE @Rowcount > 0
    BEGIN
        ;WITH CTE AS 
        (
            -- Grab the next batch of rows that still need the copy.
            SELECT TOP (@Batch) NewColumn, OldColumn 
            FROM SomeTable 
            WHERE NewColumn <> OldColumn
                  OR (NewColumn IS NULL AND OldColumn IS NOT NULL)
        )
        -- Updating the CTE updates the underlying rows in SomeTable.
        UPDATE CTE
            SET NewColumn = OldColumn;
        SET @Rowcount = @@ROWCOUNT
    END
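If you want to try the batching behavior before running it against the real 11-million-row table, a minimal sandbox sketch (the test table name, column types, and row count here are assumptions, not from the question) looks like this:

```sql
-- Hypothetical sandbox copy of SomeTable to exercise the batch loop on.
CREATE TABLE dbo.SomeTableTest
(
    Id        INT IDENTITY PRIMARY KEY,
    OldColumn INT NOT NULL,
    NewColumn INT NULL
);

-- Seed a few thousand rows with NewColumn left NULL.
INSERT INTO dbo.SomeTableTest (OldColumn)
SELECT TOP (5000) ABS(CHECKSUM(NEWID())) % 100
FROM sys.all_objects AS a
CROSS JOIN sys.all_objects AS b;

-- After running the batched CTE update above against SomeTableTest,
-- this should return 0 rows still needing the copy.
SELECT COUNT(*) AS RowsStillToCopy
FROM dbo.SomeTableTest
WHERE NewColumn IS NULL AND OldColumn IS NOT NULL;
```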
pacreely
  • 1,881
  • 2
  • 10
  • 16
  • Thanks for showing that updates to CTE cascade to the underlying table. – HappyTown Jan 24 '17 at 19:20
  • one more variation of batch updates using `update...top` combination in my answer. – HappyTown Jan 27 '17 at 23:24
  • 1
    Keep batch size less than 5000 https://stackoverflow.com/questions/17905777/dataset-update-without-locking-sql-server-40-000-rows/17905913#17905913 – Michael Freidgeim Dec 01 '18 at 22:26
  • Doesn't this just update the CTE not the actual SomeTable? – Basil Mar 08 '21 at 22:30
  • @Basil the CTE isn't an object, it is a representation of the object SomeTable. If you look at the execution plans of queries that use CTEs, as far as the Optimiser is concerned the CTE doesn't exist. – pacreely Mar 09 '21 at 09:18
  • When you update a table, if you do not set the `Rowcount`, but the actual number of records to be updated are less than 5k, is it still going to lock the entire table? – hyankov Mar 08 '22 at 16:04
7

I took @pacreely's approach (see his answer on this question) of updating in batches and created an update...top variation. I added the (rowlock) hint to tell SQL Server to keep the locks at row level.

See update...top for details. Also note that you cannot use order by with top in an update, insert, merge, or delete statement, so the referenced rows are not arranged in any order.

declare @BatchSize  int = 1000
declare @RowCount   int = @BatchSize

while @RowCount > 0
begin
    update top (@BatchSize) SomeTable with (rowlock)
    set NewColumn = OldColumn
    where 
        NewColumn <> OldColumn      or
        (
            NewColumn is null       and
            OldColumn is not null
        )
    select @RowCount = @@rowcount
end
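A variation some people use on busy tables pauses between batches so concurrent SELECT and INSERT traffic gets a chance to run; the WAITFOR delay and the smaller batch size below are my additions (SQL Server attempts lock escalation at roughly 5,000 locks held by one statement on an object, so staying under that helps keep locks at row level):

```sql
declare @BatchSize int = 4500   -- stay under the ~5000-lock escalation threshold
declare @RowCount  int = @BatchSize

while @RowCount > 0
begin
    update top (@BatchSize) SomeTable with (rowlock)
    set NewColumn = OldColumn
    where
        NewColumn <> OldColumn      or
        (
            NewColumn is null       and
            OldColumn is not null
        )
    select @RowCount = @@rowcount

    waitfor delay '00:00:01'    -- brief pause so other sessions can get in
end
```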
HappyTown
  • 6,036
  • 8
  • 38
  • 51
2

This question has an answer on the Database Administrators site on StackExchange here: https://dba.stackexchange.com/questions/127158/how-to-get-sql-insert-and-or-update-to-not-lock-entire-table-on-ms-sql-server

Community
  • 1
  • 1
STLDev
  • 5,950
  • 25
  • 36
0

We recently had a case where we wanted to do something similar, but slowly over days (updating only a certain number of records per run, and only during certain hours). The most recent data was fine, but millions of rows of older data needed to be updated. Our data table looks something like this:

Create Table FileContent
(
    FileContent varchar(max),
    File_PK bigint,
    NewFileContent varchar(max)
)

And we only needed to update certain rows, but millions of them. We created a table to store our progress so we could use a scheduled job to iterate through and update the main table, then populated this table with the primary keys of the main table records that needed updating:

Create Table FilesToUpdate
(
    File_PK bigint,
    IsUpdated bit NOT NULL DEFAULT 0
)
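The population step isn't shown in the answer; a sketch of it might look like the following, where the WHERE clause is a placeholder for whatever predicate identifies the rows that need fixing in your system:

```sql
-- Record the PKs of the rows that still need the fix.
-- The predicate below is a placeholder, not from the original answer.
INSERT INTO FilesToUpdate (File_PK)
SELECT File_PK
FROM FileContent
WHERE NewFileContent IS NULL;
```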

Then we scheduled the following script to do the updating (for your own use, play with the batch sizing and scheduling for what works with your system).

/***  
Script to update and fix records.
***/
DECLARE @Rowcount INT = 1 -- seeded to 1 so the loop body runs at least once
    ,   @BatchSize INT = 100 -- how many rows will be updated on each iteration of the loop 
    ,   @BatchesToRun INT = 25 -- the max number of times the loop will iterate
    ,   @StartingRecord BIGINT = 1;

-- Get the highest File_PK not already fixed as a starting point.
Select @StartingRecord = MAX(File_PK) From FilesToUpdate where IsUpdated = 0

-- While there are still rows to update and we haven't hit our limit on iterations... 
WHILE (@Rowcount > 0 and @BatchesToRun > 0)   
BEGIN
    print Concat('StartingRecord (Start of Loop): ', @StartingRecord)
    UPDATE FileContent SET  NewFileContent = 'New value here'
    WHERE File_PK BETWEEN (@StartingRecord - @BatchSize + 1) AND @StartingRecord;

    -- @@Rowcount is the number of records affected by the last statement.  If this returns 0, the loop will stop because we've run out of things to update.
    SET @Rowcount = @@ROWCOUNT;
    print Concat('RowCount: ', @Rowcount)

    -- Record which PKs were updated so we know where to start next time around.
    UPDATE FilesToUpdate Set IsUpdated = 1 where File_PK BETWEEN (@StartingRecord - @BatchSize + 1) AND @StartingRecord;

    -- The loop will stop after @BatchSize*@BatchesToRun records are updated. 
    -- If there aren't that many records left to update, the @Rowcount checks will stop it. 
    SELECT @BatchesToRun = @BatchesToRun - 1
    print Concat('Batches Remaining: ',@BatchesToRun)

    -- Set the starting record for the next time through the loop.
    SELECT @StartingRecord -= @BatchSize
    print Concat('StartingRecord (End of Loop): ', @StartingRecord)
END
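Because the job runs over days, it helps to check progress between runs. A simple progress query against the tracking table defined above:

```sql
-- How many rows are done vs. still pending.
SELECT IsUpdated, COUNT(*) AS RowCnt
FROM FilesToUpdate
GROUP BY IsUpdated;
```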
Jen R
  • 1,527
  • 18
  • 23