2

I've seen the following:

Random element of List<T> from LINQ SQL

What I'm doing now - to get three random elements - is the following:

var user1 = users.ElementAt( rand.Next( users.Count() ) );
var user2 = users.Where(u => !u.Equals(user1)).ElementAt( rand.Next( users.Count() ) );
var user3 = users.Where(u => !u.Equals(user1) && !u.Equals(user2)).ElementAt( rand.Next( users.Count() ) );

But this is obviously unweildy and inefficient. How can I do this elegantly and efficiently with one trip to the database?

EDIT: Based on below answer, I made the following extension method:

    public static IQueryable<T> SelectRandom<T>(this IQueryable<T> list, int count) {
        return list.OrderBy(_ => Guid.NewGuid()).Take(count);
    }    

    var result = users.SelectRandom(3);

BUT it seems like this would be inefficient for large datasets. Another proposal is to take the .Count() of the IQueryable, select n random numbers that fall within that result, and then shoot a query to the db with the selected random indices... but the Count() might be expensive here.

Community
  • 1
  • 1
RobVious
  • 12,685
  • 25
  • 99
  • 181

1 Answers1

3

The following should work:

users.OrderBy(_ => Guid.NewGuid()).Take(3)

This retrieves the first 3 elements from the users table, while sorting them by a value which is different each time.

Compared to AD.Net's answer.. well you'd require a list of userids generated randomly.. it doesn't suggest a way to do that

Martin Booth
  • 8,485
  • 31
  • 31
  • Interesting... Would you mind explaining how this works, and how it compares to the other answer? – RobVious May 24 '14 at 03:34
  • @RobVious NewGuid() generates a new random GUID, so when youorder by it you will get a random ordering, and it'll be different everytime you execute that line. – slugster May 24 '14 at 03:39
  • Nice! I updated my question with an extension method for this. – RobVious May 24 '14 at 03:46
  • 1
    LOOKS COOL FOR 1 MILLION USERS TABLE. (we are executing some SQL here) There is a word "efficiently" in question, and we are generating GUIDS for each user and then sort all those strings, and take 3 from this beautiful list. – Valentin Kuzub May 24 '14 at 03:48
  • @ValentinKuzub - my db contains <10,000 rows, but I'm curious - how would you handle this for large datasets? – RobVious May 24 '14 at 03:51
  • A `Guid` isn't random. At least there is no guarantee that it will be. `System.Random` is always preferable, but you'd need to bring the records back into memory for that. – Enigmativity May 24 '14 at 03:54
  • Looks like http://stackoverflow.com/questions/3339192/linq-orderby-random this question handles all my concerns beautifully. Having random value in database is very important for "efficient" solution – Valentin Kuzub May 24 '14 at 03:56
  • From what I can see, the method proposed by Ian Mercer (above link) will still sort the whole table. The difference is the way it generates the sort key.. XORing a key in the database with a random seed is certainly quicker than generating a guid for each row, but it doesn't avoid the fact that the whole table will be sorted first – Martin Booth May 24 '14 at 04:11
  • @Enigmativity `NewGuid()` is not predictable or sequential, so works well for this problem. I'm not sure why you say it's "not random". System.Random is entirely predictable if the same seed is used. – slugster May 24 '14 at 05:50
  • @slugster - being "not predictable" nor "sequential" does not mean it is random. `System.Random` is predictable AND it is random. What you want from a random number stream is high degree of entropy and a uniform distribution - both of which `Guid` doesn't have. – Enigmativity May 24 '14 at 07:57
  • @enig Can you show that NewGuid() is predictable or not uniformly distributed? I suspect you can't, which makes it fine for this use. IOW it's random enough even if it doesn't fit some technical academic criteria (which System.Random doesn't fit either). – slugster May 24 '14 at 08:34
  • @slugster - Of course I can. See https://stackoverflow.com/questions/2621563/how-random-is-system-guid-newguid-take-two. Also, Eric Lippert says "Now, this is assuming that the GUIDs are chosen by a perfectly uniform random process. They are not. GUIDs are in practice generated by a high-quality pseudo-random number generator, not by a crypto-strength random number generator." & "GUIDs are guaranteed to be unique but not guaranteed to be random. Do not use them as random numbers. " (see http://blogs.msdn.com/b/ericlippert/archive/2012/05/07/guid-guide-part-three.aspx) – Enigmativity May 24 '14 at 09:09
  • @Enigmativity That's an excellent reference, that explanation adds some value to the answer. – slugster May 24 '14 at 10:05