I am trying to get 5 random number of rows from a large table (over 1 million rows) with a fast method.
So far what I have tested with these SQL queries:
Method 1
Select top 5 customer_id, customer_name
from Customer TABLESAMPLE(1000 rows)
order by newid()
This method estimated I/O cost is 0.0127546
so this is very fast (index scan nonclustered)
Method 2
select top 5 customer_id, customer_name
from Customer
order by newid()
This method's sort estimated I/O cost is 117.21189
and index scan nonclustered estimated I/O cost is 2.8735
, so this is affecting performance
Method 3
select top 5 customer_id, customer_name
from Customer
order by rand(checksum(*))
This method's sort estimated I/O cost is 117.212
and index scan nonclustered estimated I/O cost is 213.149
, this query is slower than all because estimated subtree cost is 213.228
so it's very slow.
UPDATE:
Method 4
select top 5 customer_id, customer_name, product_id
from Customer
Join Product on product_id = product_id
where (customer_active = 'TRUE')
order by checksum(newid())
This approach is better and very fast. All the benchmark testing is fine.
QUESTION
How can I convert Method 4 to LINQ-to-SQL? Thanks