6

I want select random row with different probability column based:

ID     Type
Bike   1
Moto   1
Asus   2
Car    1
Apple  2
John   3

If i do this i will have random probability:

select top 1 * from Items order by newid()

I want John (type 3) has 70% probability to get, and 5% for type 1 and 25% for type 2.

Cœur
  • 37,241
  • 25
  • 195
  • 267
oteal
  • 614
  • 2
  • 13
  • 25
  • 1
    possible duplicate of [Select random row from a PostgreSQL table with weighted row probabilities](http://stackoverflow.com/questions/13040246/select-random-row-from-a-postgresql-table-with-weighted-row-probabilities) – Magnus Mar 21 '13 at 15:36

2 Answers2

4

I would use the RAND() function instead of NEWID().

Using RAND(), we can generate a random number between 1 and 100, and then use a CASE statement to select a type based on the number randomly generated.

According to MSDN:

RAND() returns a pseudo-random float value from 0 through 1, exclusive

Meaning that multiplying RAND() by 100 will give us a number from 0 to 99. Adding 1 changes the range to 1 to 100.

If after selecting which type to return, you want to randomly select a record from that type, you can then add a SELECT TOP 1... ORDER BY NEWID() to get a random record of that type:

DECLARE @Random INT
SET @Random = (RAND() * 100) + 1

SELECT TOP 1 ID, Type
FROM Items
WHERE Type = CASE
  WHEN @Random > 30 THEN 3
  WHEN @Random BETWEEN 6 AND 30 THEN 2
  ELSE 1
END
ORDER BY NEWID()

See it here... run it a few times to see that the results match the probabilities.

Michael Fredrickson
  • 36,839
  • 5
  • 92
  • 109
  • What's the point of the + 1? It seems like it throws off the percentages making it 4% for 1, 25% for 2 and 71% for 3. – Eric J. Price Mar 21 '13 at 15:43
  • @Love2Learn I updated the answer with an explanation for the `+1`. Without it the range would be 0 to 99 based on the possible return values for `RAND()` – Michael Fredrickson Mar 21 '13 at 15:47
  • Yeah, just noticed you were implicitly converting to an int, right on. I think you need to adjust the ranges because there are only 4 chances for a 1, 25 for a 2 and 71 for a 3. Change 3 to 31+ and 2 to 6 through 30? – Eric J. Price Mar 21 '13 at 15:48
  • @Love2Learn You're right... thanks for pointing that out! Should be fixed now. – Michael Fredrickson Mar 21 '13 at 15:55
1

You mean 5% probability for entire type=1 group, or you want every record of type=1 to have 5% probability of being selected? If it's second option, then you have 70+15+50=135 = no way you can do this. If it's first option, then you'll have to make 2 draws - first for a type, and then for a row in this type.

AdamL
  • 12,421
  • 5
  • 50
  • 74