2

I'm working on a project that requires me to write a function that selects a designated number of random elements from a set. Then map those elements to a variable for later comparison.

So in my scenario I have to select 5% of any given set.

let rec randomSet (a:Set<string>) =
let setLength = (a.Count / 100) * 5

let list = []
let rand = System.Random
if set.Length <> setLength then
    // some code will go here
    randomSet setLength eIDS
else
    set

^Please criticize my code, I've only being coding in F# for a week.

I've tried to do it recursively but I have a feeling that it's the wrong way to go. I have tried other methods but they use the .take function, and thus the returned collection is the same every time.

Any ideas? I'm not after 1 element from a set, I'm after 5% of any set that's thrown at it.

This is not the same question as this :How can I select a random value from a list using F#

If you think it is, please explain.

Community
  • 1
  • 1
Mark
  • 1,633
  • 1
  • 11
  • 26
  • If efficiency is a concern, you might want to look into [Reservoir Sampling](https://en.wikipedia.org/wiki/Reservoir_sampling). – kvb Aug 31 '16 at 16:53

2 Answers2

5

There are multiple ways of doing this. Depending on the number of elements in the input and the number of items you want to pick, different strategy might be more efficient.

Probably the simplest way is to sort the input by a random number and then use take to get the required number of elements:

let data = [| 0 .. 1000 |]

let rnd = System.Random()

data 
|> Seq.sortBy (fun _ -> rnd.Next())
|> Seq.take 50

This will randomly sort the sequence (which may be slow for large sequences), but then it takes precisely the number of elements you want (unlike Mark's solution, which will return roughly 5% of items).

If you wanted to select small number from a large list, it might be better to randomly generate indices (making sure that there are no duplicates) and then doing direct lookup based on the indices.

Tomas Petricek
  • 240,744
  • 19
  • 378
  • 553
  • Thank you. I tried the code in my program and with almost 3000 items it didn't effect the speed much, added an extra 20 seconds to an already 2 minute long run time. Also my brother bought me your book yesterday and I look forward to reading it. – Mark Aug 31 '16 at 16:36
3

Since Set<'a> implements Seq<'a>, this question is, in fact, a duplicate of How can I select a random value from a list using F# All you'd need to do is to shuffle the set, take the first 5% elements, and put it back into a set.

Just for the fun of it, though, here's another solution. If you need to pick 5%, then first define a predicate that returns true only 5% of the times it's called:

let r = System.Random ()
let fivePercent _ = r.NextDouble () < 0.05

You can now filter your set using that predicate:

let randomlySelectedSubset = stringSet |> Seq.filter fivePercent |> set
Community
  • 1
  • 1
Mark Seemann
  • 225,310
  • 48
  • 427
  • 736
  • I apologies, I should've said that I don't see how it's the same question. I'm new to F# and I haven't had much experience with sets or sequences. I'll take what you've said on board and focus more time into them. – Mark Aug 31 '16 at 16:23
  • And this is the winner if speed was an issue ;-) – Helge Rene Urholm Aug 31 '16 at 17:35
  • Although, this won't necessarily take exactly 5% of the set (5% is the expectation for the size you will take, but depending on the size of the set you could get a lot of deviation from this). – Pash101 Sep 01 '16 at 12:46