Update table with random record in update statement in SQL Server?

Question

I have two tables. Table 1 has about 80 rows and Table 2 has about 10 million.

I would like to update all the rows in Table 2 with a random row from Table 1. I don't want the same row for all the rows. Is it possible to update Table 2 and have it randomly select a value for each row it is updating?

This is what I have tried, but it puts the same value in each row.

update member_info_test
set hostessid = (SELECT TOP 1 hostessId FROM hostess_test ORDER BY NEWID())


This will point you in the right direction: http://stackoverflow.com/questions/19412/how-to-request-a-random-row-in-sql — Landjea, Oct 25 '12 at 20:41
You don't want the same record for one? Difficult when the first table has 80 and the table you want to update has 10M records. — Tim Schmelter, Oct 25 '12 at 20:42
Well not all the same records for every record. I just want it to use the 80 records from that one table — chobo, Oct 25 '12 at 20:48
Do you need to do an update? Can you just remove all records and do an insert? — Abe Miessler, Oct 25 '12 at 20:48
Your query looks okay. The only thing I can think of is that the optimizer is executing the subquery only once. It should not be doing so, because `newid()` is volatile. — Gordon Linoff, Oct 25 '12 at 20:54

score 16 · Accepted Answer · answered Oct 25 '12 at 20:53

OK, I think this is one of the weirdest queries I've written, and I think it's going to be terribly slow. But give it a shot:

UPDATE A
SET A.hostessid = B.hostessId
FROM member_info_test A
CROSS APPLY (SELECT TOP 1 hostessId
             FROM hostess_test 
             WHERE A.somecolumn = A.somecolumn
             ORDER BY NEWID()) B

Lamak

The values are all the same :( – chobo Oct 25 '12 at 21:00
@chobo - Really? I tested this with sample data and it worked fine. But to get different values, the `WHERE A.somecolumn = A.somecolumn` was mandatory – Lamak Oct 25 '12 at 21:06
I don't know why it works for you, but I get the same values in each row – chobo Oct 25 '12 at 21:59
Actually this query sort of works, but what confuses me is the Where A.somecolumn = A.somecolumn. I seem to get different results depending on the columns I use – chobo Oct 26 '12 at 16:19
It seems the more different values A.someColumn has the more random the results – chobo Oct 26 '12 at 16:23
@chobo that may be the reason. I tried it with the key for that table and it worked as intended. – Lamak Oct 26 '12 at 16:29
You would need to correlate on a unique value from `A` to be sure that if a spool is added it is always rebound not rewound. [Duplicate of this question](http://stackoverflow.com/a/12922951/73226) – Martin Smith Oct 26 '12 at 17:40
@MartinSmith I did that on my test query, but not knowing that it was a necessity. And your answer explains why it works that way, thanks – Lamak Oct 26 '12 at 18:19
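
Following the comments above, a minimal sketch of the same update with the outer reference placed on a unique column. The column name `memberid` is hypothetical (the question does not show the key of `member_info_test`), and the query has not been run against the asker's data:

```sql
-- Assumption: member_info_test has a unique key column named memberid
-- (hypothetical name; substitute the table's real key).
UPDATE A
SET A.hostessid = B.hostessId
FROM member_info_test AS A
CROSS APPLY (SELECT TOP (1) h.hostessId
             FROM hostess_test AS h
             -- The predicate is always true; it exists only to reference
             -- a unique outer column from inside the subquery.
             WHERE A.memberid = A.memberid
             ORDER BY NEWID()) AS B;
```

Per the comments, correlating on a column with few distinct values gave results that looked less random, which is why a unique column is used here.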

Gordon Linoff · Answer 2 · 2017-06-14T11:47:54.077

1

I think this will work (at least, the with portion does):

with toupdate as (
      select (select top . . . hostessId from hostess_test where mit.hostessId = mit.hostessId order by newid()) as newval,
             mit.*
      from member_info_test mit
     )
update toupdate
    set hostessid = newval;

The key to this (and to Lamak's) is the outer correlation in the subquery. This is convincing the optimizer to actually run the query for each row. I don't know why this would work and the other version would not.

edited Jun 14 '17 at 11:47

answered Oct 25 '12 at 21:10

If you put in a 1 where the `. . .` is, then it should work. Any idea why I cannot insert this code? – Gordon Linoff Oct 25 '12 at 21:15
This worked fine for me in SQL2012; I had to omit the `mit` alias in the `update` portion, however. All-in-all, a good, logical, solution for a one-off problem. – Paul Suart Jun 14 '17 at 09:30
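
Since several comments in this thread report every row receiving the same value, one quick way to check the outcome of any of these updates is to count rows per assigned value. A minimal sketch using the table and column names from the question:

```sql
-- Count how many rows in member_info_test received each hostessid.
-- If every row was given the same value, only one group comes back.
SELECT hostessid, COUNT(*) AS rows_assigned
FROM member_info_test
GROUP BY hostessid
ORDER BY rows_assigned DESC;
```

With about 80 rows in `hostess_test`, a result that shows many distinct groups suggests the subquery was evaluated more than once.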

Leblanc Meneses · Answer 3 · 2013-09-23T20:32:20.310

Here is what I ended up using:

EnvelopeInformation would be your Table 2.

PaymentAccountDropDown would be your Table 1 (in my case I had 3 items); change 3 to 80 for your use case.

;WITH cteTable1 AS (
    SELECT
        ROW_NUMBER() OVER (ORDER BY NEWID()) AS n,
        PaymentAccountDropDown_Id
    FROM EnvelopeInformation
    ),
cteTable2 AS (
    SELECT 
        ROW_NUMBER() OVER (ORDER BY NEWID()) AS n,
        t21.Id
    FROM PaymentAccountDropDown t21
    )
UPDATE cteTable1
   SET PaymentAccountDropDown_Id = (
       SELECT Id 
       FROM cteTable2
       WHERE  (cteTable1.n % 3) + 1 = cteTable2.n
)

reference: http://social.technet.microsoft.com/Forums/sqlserver/pt-BR/f58c3bf8-e6b7-4cf5-9466-7027164afdc0/updating-multiple-rows-with-random-values-from-another-table
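
Adapted to the tables in the question, the same row-number approach might look like the sketch below. It assumes the table and column names from the question, and it replaces the hard-coded 3 with a count of `hostess_test`; the CTE names are made up for illustration, and the query has not been run against the asker's data:

```sql
;WITH members AS (
    -- Number the rows of the large table in random order.
    SELECT ROW_NUMBER() OVER (ORDER BY NEWID()) AS n,
           hostessid
    FROM member_info_test
),
hostesses AS (
    -- Number the rows of the small table in random order.
    SELECT ROW_NUMBER() OVER (ORDER BY NEWID()) AS n,
           hostessId
    FROM hostess_test
)
UPDATE members
SET hostessid = (
    SELECT h.hostessId
    FROM hostesses AS h
    -- Map each member row number onto 1..COUNT(*) of hostess_test.
    WHERE h.n = (members.n % (SELECT COUNT(*) FROM hostess_test)) + 1
);
```

Counting the rows instead of hard-coding the number keeps the modulo in step with the small table if rows are added or removed.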

score 0 · Answer 4 · edited May 23 '17 at 12:09

Update Table with Random fields

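-- Note: ABS(CHECKSUM(NEWID()) % n) yields 0 through n - 1, so this assumes
-- the Id values in z.CityStateZip are contiguous over that range.
UPDATE p
    SET p.City = b.City
    FROM Person p
    CROSS APPLY (SELECT TOP 1 City
                 FROM z.CityStateZip
                 WHERE p.SomeKey = p.SomeKey and -- ... the magic! ↓↓↓
                 Id = (Select ABS(Checksum(NewID()) % (Select count(*) from z.CityStateZip)))) b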

answered May 27 '15 at 16:45

CSharper
