0

This question is related to this thread, not sure if I am supposed to post this new question in that post or create a new post, but was worried it might not get seen as a follow up question within the old post ...

  • I need to replace a bunch of IDs with random but unique new ones in an SQL table.
  • I am using t-clausen.dk's code snip from this thread to generate the new IDs.
  • I am using a temp table to iterate through the old IDs, generating the new ID, and then updating my table with the new ID. (that part of the solution in this thread)
  • My problem is that all of the new IDs end up being the same. How can I get @r to clear out so that it can generate a new number?
  • Or is there a better way to solve this issue generally? Say without looping ...?

    SELECT * INTO #ControlTable FROM tempmaster 
    DECLARE @ei varchar(max)
    DECLARE @r  varchar(8)
    
    WHILE EXISTS (SELECT * FROM #ControlTable)
    BEGIN
        SELECT @ei = (SELECT TOP 1 externalid FROM #ControlTable ORDER BY externalid ASC)
    
        -- CREATE UNIQUE RANDOM ID
        SELECT @r = coalesce(@r, '') + n
        FROM (SELECT top 8 
        CHAR(number) n FROM
        master..spt_values
        WHERE type = 'P' AND 
        (number between ascii(0) and ascii(9)
        or number between ascii('A') and ascii('Z')
        or number between ascii('a') and ascii('z'))
        ORDER BY newid()) a
    
        -- REPLACE OLD ID
        UPDATE tempmaster SET externalid  = @r WHERE externalid = @ei
    
        DELETE #ControlTable WHERE externalid = @ei
    
        /*TESTING*/
        --SELECT @ei AS EI, @r AS [newID]
        --SELECT * FROM #ControlTable
        --SELECT * FROM tempmaster WHERE externalid = @ei OR externalid = @r
    END
    
    drop table #ControlTable
    

HERE IS AN OUTLINE OF AN ATTEMPT AT A SET-BASED APPROACH

    DECLARE @r varchar(8);
    SELECT oid, startdate, enddate,
        coalesce(@r, '') + n
        FROM (SELECT TOP 8 
        CHAR(number) n FROM
        master..spt_values
        WHERE type = 'P' AND 
        (number between ascii(0) and ascii(9)
        or number between ascii('A') and ascii('Z')
        or number between ascii('a') and ascii('z'))
        ORDER BY newid()) 
    as externalid
    FROM MasterTable
Community
  • 1
  • 1
dharol
  • 151
  • 2
  • 17

2 Answers2

2

Although your loop is not how I would approach the problem (a set-based approach is much better), your problem is that you do not re-initialize @r inside the loop.

So, add:

set @r = NULL;

to the beginning of the loop.

Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
1

I did not test this, but you could replace the loop by this UPDATE statement. It requires your current enternalid values to be unique:

WITH cte AS (
    SELECT
        externalid,
        concat(
            substring(chars, num % 62, 1),
            substring(chars, (num/62) % 62, 1),
            substring(chars, (num/3844) % 62, 1),
            substring(chars, (num/238328) % 62, 1),
            substring(chars, (num/14776336) % 62, 1),
            substring(chars, (num/916132832) % 62, 1),
            substring(chars, (num/56800235584) % 62, 1),
            substring(chars, (num/3521614606208) % 62, 1)
        ) AS unique_string8 
    FROM (
        SELECT
            externalid,
            (rn * 34524689549219 + seed) % 199689672115897 AS num,
            '0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz' AS chars
        FROM (
            SELECT externalid,
                   ROW_NUMBER() OVER (ORDER BY externalid) AS rn,
                   0 AS seed
            FROM   tempmaster
        ) AS subq1
    ) AS subq2
)
UPDATE     t
SET        externalid = cte.unique_string8
FROM       tempmaster t
INNER JOIN cte
        ON cte.externalid = t.externalid;

The idea is that the cte query maps your existing externalid with the new 8-character key, and then performs the update using that mapping.

The characters are taken from a literal string with the 62 possible characters. The value num is a multiple of a prime plus some seed (0 in this case) and kept within in limits by taking the modulo by a prime that is close to the number of possible strings with 8 characters.

This resulting number is then translated into an 8 character string as if the number is to represented in 62-base. The denominators in the formula are powers of 62.

This way, it is guaranteed that you'll never get a duplicate (unless you have more records than possible 8-letter combinations).

There is no random part, so it will produce always the same series, but you can change the seed value to any integer number to change the series that will be produced.

trincot
  • 317,000
  • 35
  • 244
  • 286