Random row from Linq to Sql

Question

What is the best (and fastest) way to retrieve a random row using Linq to SQL when I have a condition, e.g. some field must be true?

You have two options for the order you check the true conditions. If the true condition happens on most items then just grab a random item then test and repeat while false. If rare let the database limit the options to the true condition and then grab one at random. — Rex Logan, Mar 15 '09 at 18:39
As with lots of answers on this site - second-rated is much better than the accepted one. — nikib3ro, Mar 12 '13 at 05:05

score 176 · Accepted Answer · edited Aug 24 '10 at 19:16

176

You can do this at the database, by using a fake UDF; in a partial class, add a method to the data context:

partial class MyDataContext {
     [Function(Name="NEWID", IsComposable=true)] 
     public Guid Random() 
     { // to prove not used by our C# code... 
         throw new NotImplementedException(); 
     }
}

Then just order by ctx.Random(); this will do a random ordering at the SQL-Server courtesy of NEWID(). i.e.

var cust = (from row in ctx.Customers
           where row.IsActive // your filter
           orderby ctx.Random()
           select row).FirstOrDefault();

Note that this is only suitable for small-to-mid-size tables; for huge tables, it will have a performance impact at the server, and it will be more efficient to find the number of rows (Count), then pick one at random (Skip/First).

for count approach:

var qry = from row in ctx.Customers
          where row.IsActive
          select row;

int count = qry.Count(); // 1st round-trip
int index = new Random().Next(count);

Customer cust = qry.Skip(index).FirstOrDefault(); // 2nd round-trip

edited Aug 24 '10 at 19:16

Armstrongest

15,181
13
67
106

answered Mar 15 '09 at 18:07

Marc Gravell

1,026,079
266
2,566
2,900

My table has approximately 30.000 rows, would you say it's small-to-mid size? – Julien Poulin Mar 15 '09 at 18:39
4

If is it 30k *after* the filter, I'd say no: don't use this approach. Do 2 round-trips; 1 to get the Count(), and 1 to get a random row... – Marc Gravell Mar 15 '09 at 19:59
+1 This is a much better approach than what I had said. I nominated mine for deletion. – Reed Copsey Mar 15 '09 at 20:39
Thank you. Based on your comment, il use the 2 round-trip version. – Julien Poulin Mar 15 '09 at 20:45
1

What if you want five (or "x") random rows? Is it best just to make six round-trips or is there a convenient way to implement it in a stored procedure? – Neal Stublen Jul 02 '09 at 14:00
3

@Neal S.: the order by ctx.Random() could be mixed with Take(5); but if you are using the Count() approach, I expect 6 round trips is the simplest option. – Marc Gravell Jul 02 '09 at 14:46
1

don't forget to add a reference to System.Data.Linq or the System.Data.Linq.Mapping.Function attribute wont work. – Jaguir Feb 19 '10 at 22:10
[sighs happily] Marc Gravell does it again! What would we do without him? THANK YOU! :) – Shaul Behr May 30 '11 at 16:47
@Shaul probably have better up-time on stackoverflow :p – Marc Gravell May 30 '11 at 17:46
9

I know this is old, but if you are selecting many random rows from a large table, see this: http://msdn.microsoft.com/en-us/library/cc441928.aspx I don't know if there's a LINQ equivalent. – jwd Sep 08 '11 at 22:39
@MarcGravell i am trying the 2nd option cause the table has many records, but some records are missing when returned, eg: when ` **first request** count 4 index 1 shows 3 product in view **second request** count 4 index 2 shows 2 product in view **third request** count 4 index 3 shows 1 product in view` may i know what i am missing ? – Shaiju T Jun 10 '15 at 08:57

score 63 · Answer 2 · edited Feb 23 '15 at 16:26

63

Another sample for Entity Framework:

var customers = db.Customers
                  .Where(c => c.IsActive)
                  .OrderBy(c => Guid.NewGuid())
                  .FirstOrDefault();

This does not work with LINQ to SQL. The OrderBy is simply being dropped.

edited Feb 23 '15 at 16:26

usr

168,620
35
240
369

answered Jan 10 '11 at 20:53

Konstantin Tarkus

37,618
14
135
121

4

Have you profiled this and confirmed that it works? In my tests using LINQPad, the order by clause is being dropped. – Jim Wooley Mar 24 '11 at 14:56
This is the best solution to this problem – reach4thelasers May 24 '12 at 14:25
9

This doesn't workin LINQ to SQL... maybe it works in Entity Framework 4 (not confirming it). You can only use .OrderBy with Guid if you are sorting a List... with DB it won't work. – nikib3ro May 30 '12 at 21:56
3

Just to finally confirm that this works in EF4 - it's great option in that case. – nikib3ro Mar 12 '13 at 05:04
1

Could you edit your answer and explain why the orderBy with a new Guid do the trick? Nice answer by the way :) – Jean-François Côté Jun 07 '13 at 12:15

Jon Skeet · Answer 3 · 2009-03-15T18:12:45.477

35

EDIT: I've only just noticed this is LINQ to SQL, not LINQ to Objects. Use Marc's code to get the database to do this for you. I've left this answer here as a potential point of interest for LINQ to Objects.

Strangely enough, you don't actually need to get the count. You do, however, need to fetch every element unless you get the count.

What you can do is keep the idea of a "current" value and the current count. When you fetch the next value, take a random number and replace the "current" with "new" with a probability of 1/n where n is the count.

So when you read the first value, you always make that the "current" value. When you read the second value, you might make that the current value (probability 1/2). When you read the third value, you might make that the current value (probability 1/3) etc. When you've run out of data, the current value is a random one out of all the ones you read, with uniform probability.

To apply that with a condition, just ignore anything which doesn't meet the condition. The easiest way to do that is to only consider the "matching" sequence to start with, by applying a Where clause first.

Here's a quick implementation. I think it's okay...

public static T RandomElement<T>(this IEnumerable<T> source,
                                 Random rng)
{
    T current = default(T);
    int count = 0;
    foreach (T element in source)
    {
        count++;
        if (rng.Next(count) == 0)
        {
            current = element;
        }            
    }
    if (count == 0)
    {
        throw new InvalidOperationException("Sequence was empty");
    }
    return current;
}

edited Mar 15 '09 at 18:12

answered Mar 15 '09 at 18:03

Jon Skeet

1,421,763
867
9,128
9,194

4

FYI - I ran a quick check and this function does have a uniform probability distribution (the incrementing count is essentially the same mechanism as the Fisher-Yates shuffle so it seems reasonable that it should be). – Greg Beech Mar 15 '09 at 18:54
@Greg: Cool, thanks. It looked okay to me with a quick check, but it's so easy to get off-by-one errors in code like this. Virtually irrelevant to LINQ to SQL of course, but useful nonetheless. – Jon Skeet Mar 15 '09 at 18:57
@JonSkeet, hi, can you check [this](http://stackoverflow.com/q/30752514/2218697) and let me know what i am missing – Shaiju T Jun 10 '15 at 09:54
@TylerLaing: No, there's not meant to be a break. On the first iteration, `current` will *always* be set to the first element. On the second iteration, there's a 50% change that it will be set to the second element. On the third iteration, there's a 33% chance it will be set to the third element. Adding a break statement would mean you'd always exit after reading the first element, making it not random at all. – Jon Skeet Feb 28 '18 at 07:48
@JonSkeet Doh! I misread your use of count (e.g. was thinking this was Fisher-Yates style with a random range like n-i). But to select the first element in Fisher-Yates is to fairly choose any of the elements. However, that requires knowing the total number of elements. I see now that your solution is neat for an IEnumerable in that the total count is not known, and there is no need to iterate over the whole source just to get the count, to then iterate again to some randomly chosen index. Rather this solves in one pass, as you stated: "need to fetch every element unless you get the count". – Tyler Laing Feb 28 '18 at 20:40
Shorthand: `public static T RandomElement(this IEnumerable source, Random rng) => source.Where((x, i) => rng.Next(i + 1) == 0).Last();` – Ronnie Overby Aug 24 '19 at 19:50

score 21 · Answer 4 · answered Jul 27 '10 at 15:32

One way to achieve efficiently is to add a column to your data Shuffle that is populated with a random int (as each record is created).

The partial query to access the table in random order is ...

Random random = new Random();
int seed = random.Next();
result = result.OrderBy(s => (~(s.Shuffle & seed)) & (s.Shuffle | seed)); // ^ seed);

This does an XOR operation in the database and orders by the results of that XOR.

Advantages:-

Efficient: SQL handles the ordering, no need to fetch the whole table
Repeatable: (good for testing) - can use the same random seed to generate the same random order

This is the approach used by my home automation system to randomize playlists. It picks a new seed each day giving a consistent order during the day (allowing easy pause / resume capabilities) but a fresh look at each playlist each new day.

what would the effect on randomness be if instead of adding a random int field you just used an existing auto-incrementing identity field (the seed would obviously remain random)? also - is a seed value with a a maximum equal to the number of records in the table adequate or should it be higher? — Bryan, Oct 13 '11 at 17:19
Agreed, this is a great answer which IMO should have more upvotes. I used this in an Entity Framework query, and the bitwise-XOR operator ^ seems to work directly thus making the condition a bit cleaner: `result = result.OrderBy(s => s.Shuffle ^ seed);` (ie. no need to implement the XOR via the ~, & and | operators). — Steven Rands, Jun 29 '16 at 10:21

score 8 · Answer 5 · answered Aug 03 '11 at 06:18

8

if you want to get e.g. var count = 16 random rows from table, you can write

var rows = Table.OrderBy(t => Guid.NewGuid())
                        .Take(count);

here I used E.F, and the Table is a Dbset

answered Aug 03 '11 at 06:18

Artur Keyan

7,643
12
52
65

score 1 · Answer 6 · answered Apr 11 '11 at 14:32

If the purpose of getting random rows is sampling, I have talked very briefly here about a nice approach from Larson et al., Microsoft Research team where they have developed a sampling framework for Sql Server using materialized views. There is a link to the actual paper also.

score 0 · Answer 7 · edited Jul 24 '12 at 16:46

0

I have random function query against DataTables:

var result = (from result in dt.AsEnumerable()
              order by Guid.NewGuid()
              select result).Take(3);

edited Jul 24 '12 at 16:46

user7116

63,008
17
141
172

answered May 18 '12 at 07:11

midhun sankar

119
1
2

score 0 · Answer 8 · answered Jun 22 '12 at 12:54

Came here wondering how to get a few random pages from a small number of them, so each user gets some different random 3 pages.

This is my final solution, working querying with LINQ against a list of pages in Sharepoint 2010. It's in Visual Basic, sorry :p

Dim Aleatorio As New Random()

Dim Paginas = From a As SPListItem In Sitio.RootWeb.Lists("Páginas") Order By Aleatorio.Next Take 3

Probably should get some profiling before querying a great number of results, but it's perfect for my purpose

score 0 · Answer 9 · edited Jul 10 '13 at 09:21

0

Using LINQ to SQL in LINQPad as C# statements look like

IEnumerable<Customer> customers = this.ExecuteQuery<Customer>(@"SELECT top 10 * from [Customers] order by newid()");
customers.Dump();

The generated SQL is

SELECT top 10 * from [Customers] order by newid()

edited Jul 10 '13 at 09:21

iCollect.it Ltd

92,391
25
181
202

answered Sep 26 '12 at 01:15

JCO

1

score 0 · Answer 10 · edited Oct 09 '12 at 23:12

The example below will call the source to retrieve a count and then apply a skip expression on the source with a number between 0 and n. The second method will apply order by using the random object (which will order everything in memory) and select the number passed into the method call.

public static class IEnumerable
{
    static Random rng = new Random((int)DateTime.Now.Ticks);

    public static T RandomElement<T>(this IEnumerable<T> source)
    {
        T current = default(T);
        int c = source.Count();
        int r = rng.Next(c);
        current = source.Skip(r).First();
        return current;
    }

    public static IEnumerable<T> RandomElements<T>(this IEnumerable<T> source, int number)
    {
        return source.OrderBy(r => rng.Next()).Take(number);
    }
}

This code is not threadsafe and can only be used in single threaded code (so **not** ASP.NET) — Chris Marisic, Sep 17 '14 at 17:24

score 0 · Answer 11 · answered Apr 24 '13 at 01:11

i use this method for take random news and its work fine ;)

    public string LoadRandomNews(int maxNews)
    {
        string temp = "";

        using (var db = new DataClassesDataContext())
        {
            var newsCount = (from p in db.Tbl_DynamicContents
                             where p.TimeFoPublish.Value.Date <= DateTime.Now
                             select p).Count();
            int i;
            if (newsCount < maxNews)
                i = newsCount;
            else i = maxNews;
            var r = new Random();
            var lastNumber = new List<int>();
            for (; i > 0; i--)
            {
                int currentNumber = r.Next(0, newsCount);
                if (!lastNumber.Contains(currentNumber))
                { lastNumber.Add(currentNumber); }
                else
                {
                    while (true)
                    {
                        currentNumber = r.Next(0, newsCount);
                        if (!lastNumber.Contains(currentNumber))
                        {
                            lastNumber.Add(currentNumber);
                            break;
                        }
                    }
                }
                if (currentNumber == newsCount)
                    currentNumber--;
                var news = (from p in db.Tbl_DynamicContents
                            orderby p.ID descending
                            where p.TimeFoPublish.Value.Date <= DateTime.Now
                            select p).Skip(currentNumber).Take(1).Single();
                temp +=
                    string.Format("<div class=\"divRandomNews\"><img src=\"files/1364193007_news.png\" class=\"randomNewsImg\" />" +
                                  "<a class=\"randomNews\" href=\"News.aspx?id={0}\" target=\"_blank\">{1}</a></div>",
                                  news.ID, news.Title);
            }
        }
        return temp;
    }

score 0 · Answer 12 · answered Nov 13 '15 at 16:10

If you use LINQPad, switch to C# program mode and do this way:

void Main()
{
    YourTable.OrderBy(v => Random()).FirstOrDefault.Dump();
}

[Function(Name = "NEWID", IsComposable = true)]
public Guid Random()
{
    throw new NotImplementedException();
}

score 0 · Answer 13 · edited Jun 25 '16 at 22:18

0

var cust = (from c in ctx.CUSTOMERs.ToList() select c).OrderBy(x => x.Guid.NewGuid()).Taket(2);

Select random 2 row

edited Jun 25 '16 at 22:18

Vasfed

18,013
10
47
53

answered Jun 25 '16 at 21:44

Bahadır ATASOY

39
1

score 0 · Answer 14 · answered Oct 08 '17 at 13:44

To add to Marc Gravell's solution. If you are not working with the datacontext class itself (because you proxy it somehow e.g. to fake the datacontext for testing purposes), you cannot use the defined UDF directly: it will not be compiled to SQL because you're not using it in a subclass or partial class of your real data context class.

A workaround for this problem is to create a Randomize function in your proxy, feeding it with the query you want to be randomized:

public class DataContextProxy : IDataContext
{
    private readonly DataContext _context;

    public DataContextProxy(DataContext context)
    {
        _context = context;
    }

    // Snipped irrelevant code

    public IOrderedQueryable<T> Randomize<T>(IQueryable<T> query)
    {
        return query.OrderBy(x => _context.Random());
    }
}

Here is how you'd use it in your code:

var query = _dc.Repository<SomeEntity>();
query = _dc.Randomize(query);

To be complete, this is how to implement this in the FAKE datacontext (which uses in memory entities):

public IOrderedQueryable<T> Randomize<T>(IQueryable<T> query)
{
    return query.OrderBy(x => Guid.NewGuid());
}

Random row from Linq to Sql

14 Answers14

Linked

Related