I need to create batches from a lazy enumerable, with the following requirements:

  • Memory friendly: items must be loaded lazily even within each batch (the result is IEnumerable<IEnumerable<T>>, which excludes solutions that build arrays)
  • The solution must not enumerate the input twice (which excludes solutions using Skip() and Take())
  • The solution must not iterate through the entire input unless required (which excludes solutions using GroupBy)

This question is similar to, but more restrictive than, the following:

Community
gremo

1 Answer

Originally posted by @Nick_Whaley in "Create batches in linq", where it was not the best response because that question was formulated differently:

Try this:

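using System.Collections.Generic;

public static class EnumerableExtensions
{
    // Splits items into lazily evaluated buckets of at most bucketSize elements.
    // All buckets share one underlying enumerator, so each bucket must be fully
    // enumerated before the next one is requested.
    public static IEnumerable<IEnumerable<T>> Bucketize<T>(this IEnumerable<T> items, int bucketSize)
    {
        // Dispose the shared enumerator once the outer enumeration ends.
        using (var enumerator = items.GetEnumerator())
        {
            // MoveNext positions the enumerator on the first item of the next bucket.
            while (enumerator.MoveNext())
                yield return GetNextBucket(enumerator, bucketSize);
        }
    }

    private static IEnumerable<T> GetNextBucket<T>(IEnumerator<T> enumerator, int maxItems)
    {
        int count = 0;
        do
        {
            yield return enumerator.Current;

            count++;
            if (count == maxItems)
                yield break; // bucket is full; the outer loop advances to the next item

        } while (enumerator.MoveNext());
    }
}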

The trick is to pass the old-fashioned enumerator between the outer and inner enumerations, so that each batch continues exactly where the previous one stopped.
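
To see the hand-off in action, here is a small sketch that traces when the source actually produces each item. BucketDemo, Source and Trace are hypothetical names introduced only for this illustration, and the block carries a compact copy of Bucketize (with the enumerator wrapped in a using block) so that it compiles on its own.

```csharp
using System;
using System.Collections.Generic;

public static class EnumerableExtensions
{
    // Compact copy of Bucketize so the sketch is self-contained.
    public static IEnumerable<IEnumerable<T>> Bucketize<T>(this IEnumerable<T> items, int bucketSize)
    {
        // The using block disposes the shared enumerator when the outer loop ends.
        using (var enumerator = items.GetEnumerator())
        {
            while (enumerator.MoveNext())
                yield return GetNextBucket(enumerator, bucketSize);
        }
    }

    private static IEnumerable<T> GetNextBucket<T>(IEnumerator<T> enumerator, int maxItems)
    {
        int count = 0;
        do
        {
            yield return enumerator.Current;
            if (++count == maxItems)
                yield break;
        } while (enumerator.MoveNext());
    }
}

public static class BucketDemo
{
    // Hypothetical lazy source that records each item at the moment it is produced.
    private static IEnumerable<int> Source(List<string> log)
    {
        for (int i = 1; i <= 5; i++)
        {
            log.Add($"produce {i}");
            yield return i;
        }
    }

    // Walks the buckets in order, fully consuming each one before requesting the next.
    public static List<string> Trace()
    {
        var log = new List<string>();
        foreach (var bucket in Source(log).Bucketize(2))
        {
            log.Add("bucket:");
            foreach (var item in bucket)
                log.Add($"  item {item}");
        }
        return log;
    }

    public static void Main()
    {
        foreach (var line in Trace())
            Console.WriteLine(line);
    }
}
```

In the trace, each "produce" line appears immediately before the matching "item" line, and every source item is produced exactly once, so nothing is buffered or enumerated twice. Note that this only holds if each bucket is fully consumed before moving to the next one: since the buckets share a single enumerator, skipping a bucket or consuming buckets out of order gives misaligned batches.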

Community
jeromerg