0

I'm trying to synthetically generate some seed values for a database. I have a list of employees and I want roughly 30% of them to be classified as "Minority" and the rest "Non-Minority", I thought that the following would work but it's classifying everyone as "Minority":

UPDATE datasetitems 
SET minority = CASE WHEN (FLOOR(RAND()*(10-1+1)+1) > 3) 
THEN 'Minority' ELSE 'Non-Minority' END;
superblowncolon
  • 183
  • 1
  • 1
  • 15

1 Answers1

1

For what you are trying to do RAND() will not work because it generates a single random value for the batch which is why you're not seeing any variance. To generate a random value for each row use:

ABS(CHECKSUM(NEWID())%<desired max random number>)

Note this sample data:

DECLARE @datasetitems TABLE (minority VARCHAR(20));
INSERT @datasetitems(minority)
SELECT TOP (10) NULL FROM sys.all_columns;

SELECT CASE ABS(CHECKSUM(NEWID())%2) WHEN 0 THEN 'minority' ELSE 'non-minority' END
FROM @datasetitems;

Returns:

------------
non-minority
non-minority
minority
minority
minority
non-minority
minority
non-minority
minority
minority

ABS(CHECKSUM(NEWID())%2) will return a 0 or 1 for each row meaning that you'll have a 50/50 chance that you'll return minority or non-minority. This logic:

SELECT CASE ABS(CHECKSUM(NEWID())%4) WHEN 0 THEN 'minority' ELSE 'non-minority' END
FROM @datasetitems;

... would mean that there's a 1 in 4 chance that "minority" will be returned and a 3 in 4 chance that "non-minority" is returned. To apply this logic to your update, your code would look like this:

UPDATE @datasetitems
SET minority = CASE ABS(CHECKSUM(NEWID())%2) WHEN 0 THEN 'minority' ELSE 'non-minority' END
FROM @datasetitems;
Alan Burstein
  • 7,770
  • 1
  • 15
  • 18