I'm learning .NET. In Python I can easily generate an iterator over iterators, but what is the canonical way to do it in C#? I wrote the function Slices (not tested, may have bugs). What looks dangerous to me is the creation of segments (with new): if the input sequence is very big, a lot of new segments will be created. I do it this way because: 1) a segment lets me keep the size of the slice, since the last slice can be cut short; 2) I don't know whether it is possible to yield enumerators.

An alternative might be to create only one segment and reinitialize it on each iteration, but ArraySegment does not seem to support that (so maybe a custom struct is needed for the segment).

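public static IEnumerable<ArraySegment<T>> Slices<T>(IEnumerable<T> sequence, int sliceSize) {
    // One buffer, overwritten for every slice; a consumer must use each segment before advancing.
    T[] slice = new T[sliceSize];
    using (IEnumerator<T> items = sequence.GetEnumerator()) {
        while (true) {
            int count = 0;
            while (count < sliceSize && items.MoveNext()) {
                slice[count++] = items.Current;
            }
            if (count == 0) yield break;  // input exhausted, stop instead of looping forever
            // The last slice may hold fewer than sliceSize items.
            yield return new ArraySegment<T>(slice, 0, count);
        }
    }
}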

So, what is the right and/or canonical way to generate an iterator of iterators in C# (needed for slices, groups, etc.)?

RandomB
  • Does [this Q&A](https://stackoverflow.com/q/419019/335858) provide the information that you are looking for? – Sergey Kalinichenko Apr 24 '18 at 17:21
  • Another solution is to allocate one array (size = `sliceSize`) and yield it on each iteration, allocating a second array only on the last iteration if the last slice is shorter than `sliceSize`, so I'd have only 2 allocations. But what is the canonical way? What is the fastest way? New objects are allocated on the heap; is a yielded struct returned from the stack, or something else? – RandomB Apr 24 '18 at 17:23
  • @dasblinkenlight In Python an iterator is not a list: items can be read from I/O, etc. As I understand it now, the same holds in C#, so a solution for splitting lists is not the same thing – RandomB Apr 24 '18 at 17:25
  • It seems like you'd want `IEnumerable<IEnumerable<T>>`. Also you might want to check out MoreLinq's `Batch` method. https://github.com/morelinq/MoreLINQ/blob/master/MoreLinq/Batch.cs – juharr Apr 24 '18 at 17:26
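
The buffered, single-pass approach discussed in these comments could be sketched roughly as follows. This is only an illustration under stated assumptions: the names `BatchSketch` and `Batches` are hypothetical, the code is not MoreLINQ's actual `Batch` implementation, and it allocates one array per batch (rather than reusing a single buffer) so that batches handed out earlier stay valid.

```csharp
using System;
using System.Collections.Generic;

public static class BatchSketch
{
    // Hypothetical helper: splits any sequence into batches in a single pass.
    public static IEnumerable<IReadOnlyList<T>> Batches<T>(this IEnumerable<T> source, int size)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        if (size <= 0) throw new ArgumentOutOfRangeException(nameof(size));
        return Iterator(source, size);
    }

    private static IEnumerable<IReadOnlyList<T>> Iterator<T>(IEnumerable<T> source, int size)
    {
        T[] bucket = null;
        int count = 0;

        // The source is enumerated exactly once, so it may be a file or a stream.
        foreach (T item in source)
        {
            if (bucket == null) bucket = new T[size];
            bucket[count++] = item;

            if (count == size)
            {
                yield return bucket;   // full batch
                bucket = null;         // next batch gets a fresh array
                count = 0;
            }
        }

        // Trailing partial batch: shrink the array so its length is the item count.
        if (count > 0)
        {
            Array.Resize(ref bucket, count);
            yield return bucket;
        }
    }
}
```

With these assumptions, `Enumerable.Range(1, 7).Batches(3)` would produce the batches 1,2,3 then 4,5,6 then 7. On .NET 6 or later, the built-in `Enumerable.Chunk` method covers the same use case.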

1 Answer

There is nothing wrong with returning IEnumerable<T> using yield:

public static IEnumerable<IEnumerable<T>> 
     Slice<T>(this IEnumerable<T> sequence, int sliceSize)
{
    while(sequence.Any()) 
    {
        yield return sequence.Take(sliceSize);
        sequence = sequence.Skip(sliceSize);
    } 
}
adjan
  • The main issue here is the use of `Count` which potentially requires a full iteration of `sequence` and is problematic if `sequence` is an unending stream. – juharr Apr 24 '18 at 17:29
  • @juharr fixed that – adjan Apr 24 '18 at 17:32
  • In fact, it is very wrong to return the same enumerator. Every time you call .Any() or consume a returned IEnumerable, the initial stream is re-created. That is, if the initial stream was a stream of file lines or DB rows, the file will be reopened or the query re-executed. Please try adding logging to the IEnumerable creation, as in the following fiddle: https://dotnetfiddle.net/X0rA3v – Maxim Kosov Apr 24 '18 at 20:26
  • @MaximKosov @juharr Either you want the possibility of deferred execution or you don't. Add a `ToList()` if re-enumerating the original collection should be prohibited. – adjan Apr 24 '18 at 21:10
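
The re-enumeration concern raised in these comments can be observed with a small experiment. The sketch below is an assumption-laden illustration, not the code from the linked fiddle: `ReEnumerationDemo`, `Source`, `Starts`, and `CountStarts` are hypothetical names, and `Source` merely stands in for an expensive sequence such as file lines or database rows. The `Slice` method is the Take/Skip approach from the answer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ReEnumerationDemo
{
    // How many times the source sequence has been started from the beginning.
    public static int Starts;

    // Hypothetical source standing in for file lines or database rows.
    public static IEnumerable<int> Source()
    {
        Starts++;  // runs on the first MoveNext of every new enumeration
        for (int i = 1; i <= 5; i++) yield return i;
    }

    // The Take/Skip approach from the answer.
    public static IEnumerable<IEnumerable<T>> Slice<T>(this IEnumerable<T> sequence, int sliceSize)
    {
        while (sequence.Any())
        {
            yield return sequence.Take(sliceSize);
            sequence = sequence.Skip(sliceSize);
        }
    }

    // Consumes every slice and reports how often the source was restarted.
    public static int CountStarts()
    {
        Starts = 0;
        foreach (IEnumerable<int> slice in Source().Slice(2))
        {
            foreach (int item in slice) { /* consume the slice */ }
        }
        return Starts;
    }
}
```

Calling `ReEnumerationDemo.CountStarts()` should return a value greater than 1, because each `Any()` call and each consumed slice starts a new enumeration of the source. Materializing the source first, for example with `Source().ToList().Slice(2)`, runs the source only once at the cost of holding all items in memory.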