1

I need to select random rows from a table based on weight in another row. Like if the user enters random value 50 I need to select 50 random rows from the table being that the rows with higher weight gets returned more number of times. I saw using NEWID() to select n number of random rows and this link

Random Weighted Choice in T-SQL

where we can select one row based on the weight from another row but I need to select several rows based on user random input number ,so will the best way be using the suggested answer in the above link and looping over it n number of times(but I think it would return the same row) is there any other easy solution.

MY table is like this

ID Name Freq
1  aaa  50
2  bbb  30
3  ccc  10

so when the user enters 50 I need to return 50 random names so it should be like more aaa ,bbb than ccc.Might be like 25 aaa 15 bbb and 10 ccc. Anything close to this will work to.I saw this answer but when I execute against my DB it seems to be running for 5mins and no results yet. SQL : select one row randomly, but taking into account a weight

Community
  • 1
  • 1
Lav
  • 99
  • 1
  • 13

2 Answers2

2

I think the difficult part here is getting any individual row to potentially appear more than once. I'd look into doing something like the following:

1) Build a temp table, duplicating records according to their frequency (I'm sure there's a better way of doing this, but the first answer that came to my mind was a simple while loop... This particular one really only works if the frequency values are integers)

create table #dup
(
    id  int,
    nm  varchar(10)
)

declare @curr int, @maxFreq int
select @curr=0, @maxFreq=max(freq)
from tbl

while @curr < @maxFreq
 begin
    insert into #dup
    select id, nm
    from tbl
    where freq > @curr

    set @curr = @curr+1
 end

2) Select your top records, ordered by a random value

select top 10 *
from #dup
order by newID()

3) Cleanup

drop table #dup
chezy525
  • 4,025
  • 6
  • 28
  • 41
  • Thanks ! I think for now I can use this solution only thing is the random selection again over the temptable doesn't return maximum optimized results.Hoping to improve it in c# side if I find some sol – Lav May 28 '13 at 22:34
  • 1
    @Lav, what do you mean by "optimized results"? If the results are random, you won't always get an even distribution based on the weights. However, a row with a higher weight will be more likely to occur in the result set than a row with a lower weight. – chezy525 May 29 '13 at 15:28
  • Yes you are right when I was executing this I thought iam not getting always the id's with max frequency but that's what random means I suppose :-) – Lav Jun 03 '13 at 19:23
0

Maybe could you try something like the following:

ORDER BY Freq * rand()

in your sql? So columns with a higher Freq value should in theory get returned more often than those with a lower Freq value. It seems a bit hackish but it might work!

beiller
  • 3,105
  • 1
  • 11
  • 19
  • My first thought too but the requirement is that the same row should be able to be returned multiple times as per the frequency. This would only allow it to be returned once. – Martin Smith May 28 '13 at 18:54
  • This doesn't select the rows with more freq ,it just returns random values.Freq is not the number of times the value appears in the table its just another column that describes weight of that row.So even if the freq is 3000 that row does not appear 3000 times in the table to get selected frequently. – Lav May 28 '13 at 18:56
  • I dont think you can arbitrarily duplicate rows in mysql so if thats the requirement, mysql is not the solution. instead should be using PHP. Also this does not select random rows. It selects (random x Freq). Since rand() returns a random float [0...1] if freq is lower given 2 random but equal values, the higer freq wins. this is my intention. – beiller May 28 '13 at 19:00
  • @beiller got it thanks ,but this selects the rows with freq higher first but only once.Actually this seems to work can you take a look whether its right SELECT top(100) t.*, RAND() * t.Freq AS w FROM table t ORDER BY w DESC – Lav May 28 '13 at 19:07
  • Ah also I thought this was mysql / php haha. If you have a small list of initial names that you want duplicated do it in the application layer. Loop over the list and pick a random value each time. – beiller May 28 '13 at 19:09