4

I am trying to think of an elegant way of getting a random subset from a set in F#

Any thoughts on this?

Perhaps this would work: say we have a set of 2x elements and we need to pick a subset of y elements. Then if we could generate an x sized bit random number that contains exactly y 2n powers we effectively have a random mask with y holes in it. We could keep generating new random numbers until we get the first one satisfying this constraint but is there a better way?

6 Answers6

2

Not having a really good grasp of F# and what might be considered elegant there, you could just do a shuffle on the list of elements and select the first y. A Fisher-Yates shuffle even helps you in this respect as you also only need to shuffle y elements.

Brian
  • 117,631
  • 17
  • 236
  • 300
Joey
  • 344,408
  • 85
  • 689
  • 683
  • F# sets don't allow for random access time O(1) so this algorithm would be ineffective without the set being converted into an array. – gradbot Jul 14 '09 at 16:13
  • Ok, would lie in the same complexity as your solution then, if I see that correctly. But you have code, I didn't get around learning F# so far, so you get my +1 :) – Joey Jul 15 '09 at 00:05
2

Agree with @JohannesRossel. There's an F# shuffle-an-array algorithm here you can modify suitably. Convert the Set into an array, and then loop until you've selected enough random elements for the new subset.

Brian
  • 117,631
  • 17
  • 236
  • 300
  • It seems that @usernameIdentifier is a common convention here for referring to others (that I believe has been co-opted from another medium), so I am just attempting to go with the flow. – Brian Jul 14 '09 at 12:11
2

If you don't want to convert to an array you could do something like this. This is O(n*m) where m is the size of the set.

open System

let rnd = Random(0);
let set = Array.init 10 (fun i -> i) |> Set.of_array

let randomSubSet n set =
    seq { 
        let i = set |> Set.to_seq |> Seq.nth (rnd.Next(set.Count))
        yield i
        yield! set |> Set.remove i 
        }
    |> Seq.take n
    |> Set.of_seq

let result = set |> randomSubSet 3 

for x in result do
    printfn "%A" x    
gradbot
  • 13,732
  • 5
  • 36
  • 69
  • 1
    Since this approach is bound to perform poorly, you're better off converting to an array, shuffling, and then taking the first m results as a set - it's probably easier to read to boot. And if you _really_ don't want to convert your original set to an array, you could still generate a random boolean mask using an appropriately shuffled array (with m true's and n-m false's), and then simply zip the array with set, filter on the masks, and map back into the set - without ever converting the original set to an array, and still maintaining O(n) performance. – Eamon Nerbonne Jul 22 '09 at 11:42
  • Your code snippet usually gives me same set of items except of last one. I got running 4 times: `[0; 1; 5], [0; 1; 6], [0; 1; 2], [0; 1; 4]`. Obviously **it's not random** subset. And it happens because of that line `yield! set |> Set.remove i` – Ivan Akcheurov Dec 14 '11 at 09:14
  • `yield!` yields all elements of a sequence, in this case the original set with one item removed (`set |> Set.remove i`), which is wrong. The function should be recursive (`let rec randomSubSet n set = ...`) and you should `yield! set |> Set.remove i |> randomSubSet (n-1)`. – Vladislav Zorov Oct 13 '12 at 00:38
1

rnd must be out of subset function.

let rnd = new Random()
let rec subset xs = 
    let removeAt n xs = ( Seq.nth (n-1) xs, Seq.append (Seq.take (n-1) xs) (Seq.skip n xs) )
    match xs with 
    | [] -> []
    | _ -> let (rem, left) = removeAt (rnd.Next( List.length xs ) + 1) xs
           let next = subset (List.of_seq left)
           if rnd.Next(2) = 0 then rem :: next else next
The_Ghost
  • 2,070
  • 15
  • 26
1

Do you mean a random subset of any size?

For the case of a random subset of a specific size, there's a very elegant answer here:

Select N random elements from a List<T> in C#

Here it is in pseudocode:

RandomKSubset(list, k):
  n = len(list)
  needed = k
  result = {}
  for i = 0 to n:
    if rand() < needed / (n-i)
      push(list[i], result)
      needed--
  return result
Community
  • 1
  • 1
dreeves
  • 26,430
  • 45
  • 154
  • 229
0

Using Seq.fold to construct using lazy evaluation random sub-set:

let rnd = new Random()

let subset2 xs = let insertAt n xs x = Seq.concat [Seq.take n xs; seq [x]; Seq.skip n xs]
                 let randomInsert xs = insertAt (rnd.Next( (Seq.length xs) + 1 )) xs
                 xs |> Seq.fold randomInsert Seq.empty |> Seq.take (rnd.Next( Seq.length xs ) + 1)
The_Ghost
  • 2,070
  • 15
  • 26