sql server - efficient way to update table from join

Question

i need to update suppID column on ordersTB table from suppTB table. tables shown below:

suppTB

suppID ,suppName , itemID ,....

ordersTB

itemID, suppID(added later so mostly null's)...

what the most efficient way to update this table. i tried this query from old Q's here (How can I do an UPDATE statement with JOIN in SQL?) ,but it executing over a day and still running

UPDATE ordersTB
SET A.suppID = B.suppID
FROM  ordersTB as A
JOIN  suppTB as B
    ON A.itemID = B.itemID
WHERE A.suppID is null

other notes & things i tried:

both tables are huge size (tens of millions)
itemID & suppID columns are indexed. i remove the indexs before the update to make it faster. still take too long.
also try MERGE query. same results..

Are you sure your SQL is correct? you're telling it you want to update ordersTB but this table/alias does not appear in the part of the query following the FROM — Caius Jard, Mar 04 '20 at 15:55
you right, my write here were worng, i just copy the correct query — rotem, Mar 04 '20 at 16:14
Post your plan output. Also tell us whether the query works if you swap ``UPDATE... SET ...` for `SELECT * FROM — Caius Jard, Mar 04 '20 at 16:17
This begs the question - why do you keep the `suppId` in both tables? Isn't the `ItemId` enough to identify the correct `SuppId` for each item? — Zohar Peled, Mar 04 '20 at 16:19
#Caius: my plan is very simple, to update table from another table. i will try your advise of testing the SELECT query after db will be available (still canceling the long lasting update query).; #Zohar: there is more columns on those tables that depends on it. also i need to update more tables this way so i curious what is the efficient way. if this had only those tables and colunms you probablly right! — rotem, Mar 04 '20 at 16:49
Does this answer your question? [How to update large table with millions of rows in SQL Server?](https://stackoverflow.com/questions/35903375/how-to-update-large-table-with-millions-of-rows-in-sql-server) — Eric Brandt, Mar 04 '20 at 17:57
As a rule of thumb, you really should strive to duplicate as little data as possible. Anything that's not used as a foreign key should be stored only on a single location. That might make select statements a little more cumbersome but ensures data integrity far better. — Zohar Peled, Mar 05 '20 at 06:08

score 1 · Answer 1 · answered Mar 04 '20 at 18:09

Updating lots of rows is expensive. The first thing to check is whether or not you are updating each row just once:

select s.item_ID, count(*)
from suppTB s
group by s.item_ID
having count(*) > 1
order by count(*) desc;

If this is returning any rows, then you have multiple updates to the same row. And that is a performance hit. You should fix the business logic. But you can put an arbitrary matching value in each row:

UPDATE o
SET suppID = s.suppID
FROM ordersTB o CROSS APPLY
     (SELECT TOP (1) s.*
      FROM suppTB s
      WHERE o.itemID = s.itemID
     )
WHERE o.suppID is null;

If this doesn't address the issue, you probably need to reload the table. The logic is something like:

select . . . , -- all columns except suppId
       s.suppId
into temp_ordersTb
from ordersTb o left join
     suppTb s
     on o.itemID = s.itemID;

truncate table ordersTb;

insert into ordersTb (. . .)
    select . . .
    from temp_ordersTb;

Of course, be sure you backup your tables before attempting this!

sql server - efficient way to update table from join

1 Answers1