I have a some Seq<'T> and I need a function:
Seq.block : int -> Seq<'T> -> Seq<Seq<'T>>
where the resulting sequence is the original sequence, split up in blocks of fixed length.
So e.g. Seq.block 2 {1;2;3;4;5} would have to return {{1;2};{3;4};{5}}
The catch is that this will have to work both for finite and infinite sequences. I have various undertaken various attempts and the last one is working, but I'm hoping this can be done cleaner. I've left my various non-working attempts in here in hope that that might be of help to somebody too.
My first attempt was:
let block size = Seq.mapi (fun i x -> (i/size,x))
>> Seq.groupBy fst
>> Seq.map (fun (_,res) -> Seq.map snd res)
This works fine for finite sequences, but not for infinite sequences (the groupBy doesn't work lazy, which in retrospect is kind of obvious).
As a second attempt I tried (hoping that fold would operate lazily)
let block size = Seq.mapi (fun i x -> (i/size,x))
>> Seq.fold (fun (accIx,fullSeq,buildSeq) (ix,el) ->
if accIx = ix then (accIx,fullSeq, Seq.append buildSeq (Seq.singleton el))
else (accIx+1,Seq.append fullSeq (seq {yield buildSeq}),Seq.singleton el))
(0, Seq.empty,Seq.empty)
>> fun (_,full,rem) -> Seq.append full (seq { yield rem})
but that is even a lot worse (throws a stackoverflow on an test example where the first attempt worked, though my version has probably lost of room for improvement).
For attempt #3 I did:
let block size x = Seq.initInfinite (fun i -> x |> Seq.skip (i*size)
|> Seq.take size)
This actually works on an infinite sequence, but a. it doesn't work on a finite sequence and b. probably suffers from performance problems as you get further out in the tail of the sequence. The recursive attempt #4 I already pretty much knew was doomed to fail, but I figured I'd give it a try anyway:
let rec block size x =
Seq.append (seq { yield Seq.take size x})
(block size (Seq.skip size x))
Attempt #5 almost works:
let block size = Seq.windowed size
>> Seq.mapi (fun i x -> (i,x))
>> Seq.filter (fun (i,_) -> i % size = 0)
>> Seq.map (fun (_,x) -> Array.toSeq x)
but has the (slight) drawback that for a finite sequence, when the sequence length is not divisible by 'size', the short last element is not in the list. Next to this, I'm slightly worried over performance because of all the arrays being generated.
Mulling over this a little longer, I came up with Attempt #6:
let block size = Seq.mapi (fun i x -> (i % size,x))
>> Seq.unfold (fun state -> if Seq.isEmpty state then None
else
match Seq.tryFind (fun (i,_) -> i = size-1) state with
| None -> state |> Seq.map snd,
Seq.empty
| _ -> state |> Seq.take size
|> Seq.map snd,
Seq.skip size state
|> Some)
Happy to hear if there are any better alternatives.