462

Is there any way I can separate a List<SomeObject> into several separate lists of SomeObject, using the item index as the delimiter of each split?

Let me exemplify:

I have a List<SomeObject> and I need a List<List<SomeObject>> or List<SomeObject>[], so that each of these resulting lists will contain a group of 3 items of the original list (sequentially).

eg.:

  • Original List: [a, g, e, w, p, s, q, f, x, y, i, m, c]

  • Resulting lists: [a, g, e], [w, p, s], [q, f, x], [y, i, m], [c]

I'd also need the size of the resulting lists to be a parameter of this function.

Draken
Felipe Lima

34 Answers

441

Try the following code.

public static List<List<T>> Split<T>(IList<T> source)
{
    return  source
        .Select((x, i) => new { Index = i, Value = x })
        .GroupBy(x => x.Index / 3)
        .Select(x => x.Select(v => v.Value).ToList())
        .ToList();
}

The idea is to first pair each element with its index, then group by the index divided by three: integer division has the effect of collecting consecutive elements into groups of 3. Finally, convert each group to a List, and the resulting IEnumerable<List<T>> to a List<List<T>>.
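Applied to the question's example, a quick usage sketch (`letters` and `groups` are hypothetical names):

```csharp
var letters = new List<char> { 'a','g','e','w','p','s','q','f','x','y','i','m','c' };
List<List<char>> groups = Split(letters);
// groups: [a,g,e], [w,p,s], [q,f,x], [y,i,m], [c]
```

LINQ to Objects' GroupBy preserves both the order of first key occurrence and the order of elements within each group, which is why the chunks come out in sequence.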

Kapé
JaredPar
  • 33
    GroupBy does an implicit sort. That can kill performance. What we need is some kind of inverse of SelectMany. – yfeldblum Jan 07 '09 at 03:19
  • 3
    @Justice, it would be nice to have a built-in partitioning system for IEnumerable. – JaredPar Jan 07 '09 at 03:31
  • @JaredPar You can do that easily enough with an extension method. I suspect it's not in there, in part, because it doesn't integrate well with SQL. 'myEnumerable.InGroupsOf(3).Select(subEnumerable => subEnumerable.Sum()).Average()' plus overloads would be nice. – yfeldblum Jan 07 '09 at 04:17
  • 6
    @Justice, GroupBy might be implemented by hashing. How do you know GroupBy's implementation "can kill performance"? – Amy B Jan 07 '09 at 14:43
  • 7
    GroupBy doesn't return anything until it's enumerated all elements. That's why it's slow. The lists OP wants are contiguous, so a better method could yield the first sublist `[a,g,e]` before enumerating any more of the original list. – Colonel Panic Jul 11 '12 at 14:20
  • 11
    Take the extreme example of an infinite IEnumerable. `GroupBy(x=>f(x)).First()` will never yield a group. OP asked about lists, but if we write to work with IEnumerable, making only a single iteration, we reap the performance advantage. – Colonel Panic Jul 11 '12 at 22:16
  • 5
    Good to note that `.GroupBy(x => x.Index % 3)` will divide the entire collection up evenly into 3 parts, so if you have 30 items you will get 3 lists of 10. The current example gives you 10 lists of 3 if you have 30. – The Muffin Man Aug 27 '13 at 19:12
  • 12
    @Nick Order is not preserved your way though. It is still a good thing to know but you'd be grouping them into (0,3,6,9,...), (1,4,7,10,...), (2,5,8,11,...). If order doesn't matter then it is fine but in this case it sounds like it matters. – Reafexus Sep 05 '13 at 19:52
  • 1
    Check the use of MoreLinq in this post: http://stackoverflow.com/questions/13731796/create-batches-in-linq – pasx Mar 09 '15 at 23:19
  • 1
    Wouldn't this simple code `int i=0; return source.GroupBy(x => (i++/3)).ToList()` also work? It works fine for me. – Jihad Haddad Nov 26 '15 at 11:35
  • 1
  • Cannot convert generic `List<List<T>>` to generic `IList<IList<T>>`? – Jjang Jan 25 '16 at 07:04
  • 3
    change `IList<IList<T>> Split<T>(IList<T> source)` to `List<List<T>> Split<T>(IList<T> source)` – Kai Feb 23 '16 at 06:54
  • I had good success using `.GroupBy(x => Math.Round(x.Index / chunkSize))` – kevp Apr 05 '16 at 20:50
  • on the last `.ToList();`: Error 45 Cannot implicitly convert type `System.Collections.Generic.List<List<T>>` to `System.Collections.Generic.IList<IList<T>>`. An explicit conversion exists (are you missing a cast?) – serge Apr 19 '16 at 14:02
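As the comments point out, the choice of operator decides the grouping shape; a small sketch of the difference (hypothetical variable names):

```csharp
var items = Enumerable.Range(0, 6);
var contiguous = items.Select((x, i) => new { i, x })
                      .GroupBy(a => a.i / 3)   // keys 0,0,0,1,1,1 -> {0,1,2}, {3,4,5}
                      .Select(g => g.Select(a => a.x).ToList());
var roundRobin = items.Select((x, i) => new { i, x })
                      .GroupBy(a => a.i % 3)   // keys 0,1,2,0,1,2 -> {0,3}, {1,4}, {2,5}
                      .Select(g => g.Select(a => a.x).ToList());
```

Use `/` for contiguous chunks of a fixed size, `%` to deal the elements round-robin into a fixed number of buckets.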
392

I just wrote this, and I think it's a little more elegant than the other proposed solutions:

/// <summary>
/// Break a list of items into chunks of a specific size
/// </summary>
public static IEnumerable<IEnumerable<T>> Chunk<T>(this IEnumerable<T> source, int chunksize)
{
    while (source.Any())
    {
        yield return source.Take(chunksize);
        source = source.Skip(chunksize);
    }
}
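A quick usage sketch (note that on .NET 6 and later this extension's name may conflict with the built-in `Enumerable.Chunk`):

```csharp
var chunks = Enumerable.Range(1, 7).Chunk(3);
// lazily yields {1,2,3}, {4,5,6}, {7}
foreach (var chunk in chunks)
    Console.WriteLine(string.Join(",", chunk));
```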
StayOnTarget
CaseyB
  • 16
    Love this solution. I'd recommend adding this sanity check to prevent an infinite loop: `if (chunksize <= 0) throw new ArgumentException("Chunk size must be greater than zero.", "chunksize");` – mroach Mar 01 '12 at 16:34
  • 14
    I like this, but it is not super efficient – Sam Saffron May 03 '12 at 04:26
  • I implemented my IQueryable<> chunking method that way, except I did a Count() to avoid n-1 .Any(). I guess the optimal implementation depends on the context for IQueryables (in my case hitting a database with LINQ to SQL). – Guillaume86 May 03 '12 at 16:51
  • 2
    For a 13 element sequence with chunk size of 3 this algorithm will access the elements 91 times. Not a problem for small sequences but very ineffecient for big. – Martin Liversage Aug 10 '12 at 10:28
  • 66
    I like this one but time efficiency is `O(n²)`. You can iterate through the list and get an `O(n)` time. – hIpPy Aug 30 '12 at 18:22
  • 10
    @hIpPy, how is it n^2? Looks linear to me – V Maharajh Oct 09 '13 at 21:02
  • 23
    @vivekmaharajh `source` is replaced by a wrapped `IEnumerable` each time. So taking elements from `source` goes through layers of `Skip`s – Lasse Espeholt Nov 24 '13 at 12:04
  • 5
    @hIpPy This will indeed be O(n^2) for delay executed `IEnumerable`s. However if you've got a plain old `array[]`, or a `List`, it'll be `O(n)`. If you like this solution and you've got an `IEnumerable`, you can use `.ToArray()` before passing it into `Chunk`, and get `O(n)`. Of course, the downsides are additional memory, and potential additional perf (if you have callers that may not enumerate all chunks, you'll unnecessarily enumerate all of `source`). Due to these disadvantages, manually doing the iteration yourself as @hlpPy suggests is likely the better option for production code. – V Maharajh Dec 04 '13 at 07:31
  • 8
    @vivekmaharajh: why do you think that list or array make it `0(n)`, they are not optimized and `Skip` and `Take` will always enumerate the sequence until that point so has `O(n^2)` complexity. On a very large list it Skip/Take apporaches are useless. I'd use the `GroupBy` with integer division or (more efficient) [MoreLINQ's `Batch`](https://code.google.com/p/morelinq/source/browse/MoreLinq/Batch.cs?r=f85495b139a19bce7df2be98ad88754ba8932a28). – Tim Schmelter Dec 11 '14 at 12:15
  • You cannot get O(n) for a simple reason: you may iterate the outer and inner enumerators in random order. Only improvement to this approach is comining the Any() and the Take(). – realbart Jul 22 '16 at 12:06
  • 3
    A similar method is now available in [.NET 6 / C#10](https://learn.microsoft.com/en-us/dotnet/api/system.linq.enumerable.chunk?view=net-6.0) – Luis Cantero Feb 27 '22 at 08:26
122

In general the approach suggested by CaseyB works fine; in fact, if you are passing in a List<T> it is hard to fault it. Perhaps I would change it to:

public static IEnumerable<IEnumerable<T>> ChunkTrivialBetter<T>(this IEnumerable<T> source, int chunksize)
{
   var pos = 0; 
   while (source.Skip(pos).Any())
   {
      yield return source.Skip(pos).Take(chunksize);
      pos += chunksize;
   }
}

This avoids massive call chains. Nonetheless, the approach has a general flaw: it materializes two enumerations per chunk. To highlight the issue, try running:

foreach (var item in Enumerable.Range(1, int.MaxValue).Chunk(8).Skip(100000).First())
{
   Console.WriteLine(item);
}
// wait forever 

To overcome this we can try Cameron's approach, which passes the above test with flying colors, since it only walks the enumeration once.

The trouble is that it has a different flaw: it materializes every item in each chunk, so for large chunks memory usage runs high.

To illustrate that try running:

foreach (var item in Enumerable.Range(1, int.MaxValue)
               .Select(x => x + new string('x', 100000))
               .Clump(10000).Skip(100).First())
{
   Console.Write('.');
}
// OutOfMemoryException

Finally, any implementation should be able to handle out of order iteration of chunks, for example:

Enumerable.Range(1,3).Chunk(2).Reverse().ToArray()
// should return [3],[1,2]

Many highly optimized solutions, like the first revision of this answer, failed there. The same issue can be seen in casperOne's optimized answer.

To address all these issues you can use the following:

namespace ChunkedEnumerator
{
    public static class Extensions 
    {
        class ChunkedEnumerable<T> : IEnumerable<T>
        {
            class ChildEnumerator : IEnumerator<T>
            {
                ChunkedEnumerable<T> parent;
                int position;
                bool done = false;
                T current;


                public ChildEnumerator(ChunkedEnumerable<T> parent)
                {
                    this.parent = parent;
                    position = -1;
                    parent.wrapper.AddRef();
                }

                public T Current
                {
                    get
                    {
                        if (position == -1 || done)
                        {
                            throw new InvalidOperationException();
                        }
                        return current;

                    }
                }

                public void Dispose()
                {
                    if (!done)
                    {
                        done = true;
                        parent.wrapper.RemoveRef();
                    }
                }

                object System.Collections.IEnumerator.Current
                {
                    get { return Current; }
                }

                public bool MoveNext()
                {
                    position++;

                    if (position + 1 > parent.chunkSize)
                    {
                        done = true;
                    }

                    if (!done)
                    {
                        done = !parent.wrapper.Get(position + parent.start, out current);
                    }

                    return !done;

                }

                public void Reset()
                {
                    // per http://msdn.microsoft.com/en-us/library/system.collections.ienumerator.reset.aspx
                    throw new NotSupportedException();
                }
            }

            EnumeratorWrapper<T> wrapper;
            int chunkSize;
            int start;

            public ChunkedEnumerable(EnumeratorWrapper<T> wrapper, int chunkSize, int start)
            {
                this.wrapper = wrapper;
                this.chunkSize = chunkSize;
                this.start = start;
            }

            public IEnumerator<T> GetEnumerator()
            {
                return new ChildEnumerator(this);
            }

            System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator()
            {
                return GetEnumerator();
            }

        }

        class EnumeratorWrapper<T>
        {
            public EnumeratorWrapper(IEnumerable<T> source)
            {
                SourceEnumerable = source;
            }
            IEnumerable<T> SourceEnumerable { get; set; }

            Enumeration currentEnumeration;

            class Enumeration
            {
                public IEnumerator<T> Source { get; set; }
                public int Position { get; set; }
                public bool AtEnd { get; set; }
            }

            public bool Get(int pos, out T item) 
            {

                if (currentEnumeration != null && currentEnumeration.Position > pos)
                {
                    currentEnumeration.Source.Dispose();
                    currentEnumeration = null;
                }

                if (currentEnumeration == null)
                {
                    currentEnumeration = new Enumeration { Position = -1, Source = SourceEnumerable.GetEnumerator(), AtEnd = false };
                }
                }

                item = default(T);
                if (currentEnumeration.AtEnd)
                {
                    return false;
                }

                while(currentEnumeration.Position < pos) 
                {
                    currentEnumeration.AtEnd = !currentEnumeration.Source.MoveNext();
                    currentEnumeration.Position++;

                    if (currentEnumeration.AtEnd) 
                    {
                        return false;
                    }

                }

                item = currentEnumeration.Source.Current;

                return true;
            }

            int refs = 0;

            // needed for dispose semantics 
            public void AddRef()
            {
                refs++;
            }

            public void RemoveRef()
            {
                refs--;
                if (refs == 0 && currentEnumeration != null)
                {
                    var copy = currentEnumeration;
                    currentEnumeration = null;
                    copy.Source.Dispose();
                }
            }
        }

        public static IEnumerable<IEnumerable<T>> Chunk<T>(this IEnumerable<T> source, int chunksize)
        {
            if (chunksize < 1) throw new InvalidOperationException();

            var wrapper =  new EnumeratorWrapper<T>(source);

            int currentPos = 0;
            T ignore;
            try
            {
                wrapper.AddRef();
                while (wrapper.Get(currentPos, out ignore))
                {
                    yield return new ChunkedEnumerable<T>(wrapper, chunksize, currentPos);
                    currentPos += chunksize;
                }
            }
            finally
            {
                wrapper.RemoveRef();
            }
        }
    }

    class Program
    {
        static void Main(string[] args)
        {
            int i = 10;
            foreach (var group in Enumerable.Range(1, int.MaxValue).Skip(10000000).Chunk(3))
            {
                foreach (var n in group)
                {
                    Console.Write(n);
                    Console.Write(" ");
                }
                Console.WriteLine();
                if (i-- == 0) break;
            }


            var stuffs = Enumerable.Range(1, 10).Chunk(2).ToArray();

            foreach (var idx in new [] {3,2,1})
            {
                Console.Write("idx " + idx + " ");
                foreach (var n in stuffs[idx])
                {
                    Console.Write(n);
                    Console.Write(" ");
                }
                Console.WriteLine();
            }

            /*

10000001 10000002 10000003
10000004 10000005 10000006
10000007 10000008 10000009
10000010 10000011 10000012
10000013 10000014 10000015
10000016 10000017 10000018
10000019 10000020 10000021
10000022 10000023 10000024
10000025 10000026 10000027
10000028 10000029 10000030
10000031 10000032 10000033
idx 3 7 8
idx 2 5 6
idx 1 3 4
             */

            Console.ReadKey();


        }

    }
}

There is also a round of optimisations you could introduce for out-of-order iteration of chunks, which is out of scope here.

As to which method you should choose? It totally depends on the problem you are trying to solve. If you are not concerned with the first flaw the simple answer is incredibly appealing.

Note that, as with most methods like this, it is not safe for multi-threading; things can get weird. If you wish to make it thread safe, you would need to amend EnumeratorWrapper.

Atif Aziz
Sam Saffron
  • Would the bug be Enumerable.Range(0, 100).Chunk(3).Reverse().ToArray() being wrong, or Enumerable.Range(0, 100).ToArray().Chunk(3).Reverse().ToArray() throwing an exception? – Cameron MacFarland May 03 '12 at 06:39
  • @SamSaffron I've updated my answer and simplified the code tremendously for what I feel is the prominent use case (and acknowledge the caveats). – casperOne May 03 '12 at 15:45
  • What about chuncking IQueryable<>? My guess is that a Take/Skip approach would be optimal if we want to delegate a maximum of the operations to the provider – Guillaume86 May 03 '12 at 16:45
  • @Guillaume86 I agree, if you have a IList or IQueryable you can take all sorts of shortcuts that would make this much faster (Linq does this internally for all sorts of other methods) – Sam Saffron May 03 '12 at 21:53
  • 1
    This is by far the best answer for efficiency. I'm having an issue using the SqlBulkCopy with an IEnumerable that runs additional processes on each column, so it must run through efficiently with only one pass. This will allow me to break up the IEnumerable into manageable sized chunks. (For those wondering, I did enable the SqlBulkCopy's streaming mode, which seems to be broken). – Brain2000 Mar 30 '16 at 19:23
  • This took the simple linq method 6s down to virtually 0s. Really fast! – Jerther Aug 11 '16 at 15:18
  • Building my own version without looking at yours, I think `Select` could be much more efficient if it skipped calling the `selector` function unless `Current` is called (admittedly causing side effects in `selector` to be ignored - perhaps `selector` could be evaluated for non-parameter references). – NetMage May 02 '18 at 18:37
72

You could use a number of queries that use Take and Skip, but that would add too many iterations on the original list, I believe.

Rather, I think you should create an iterator of your own, like so:

public static IEnumerable<IEnumerable<T>> GetEnumerableOfEnumerables<T>(
  IEnumerable<T> enumerable, int groupSize)
{
   // The list to return.
   List<T> list = new List<T>(groupSize);

   // Cycle through all of the items.
   foreach (T item in enumerable)
   {
     // Add the item.
     list.Add(item);

     // If the list has the number of elements, return that.
     if (list.Count == groupSize)
     {
       // Return the list.
       yield return list;

       // Set the list to a new list.
       list = new List<T>(groupSize);
     }
   }

   // Return the remainder if there is any.
   if (list.Count != 0)
   {
     // Return the list.
     yield return list;
   }
}

You can then call this and it is LINQ enabled so you can perform other operations on the resulting sequences.


In light of Sam's answer, I felt there was an easier way to do this without:

  • Iterating through the list again (which I didn't do originally)
  • Materializing the items in groups before releasing the chunk (for large chunks of items, there would be memory issues)
  • All of the code that Sam posted

That said, here's another pass, which I've codified in an extension method to IEnumerable<T> called Chunk:

public static IEnumerable<IEnumerable<T>> Chunk<T>(this IEnumerable<T> source, 
    int chunkSize)
{
    // Validate parameters.
    if (source == null) throw new ArgumentNullException(nameof(source));
    if (chunkSize <= 0) throw new ArgumentOutOfRangeException(nameof(chunkSize),
        "The chunkSize parameter must be a positive value.");

    // Call the internal implementation.
    return source.ChunkInternal(chunkSize);
}

Nothing surprising up there, just basic error checking.

Moving on to ChunkInternal:

private static IEnumerable<IEnumerable<T>> ChunkInternal<T>(
    this IEnumerable<T> source, int chunkSize)
{
    // Validate parameters.
    Debug.Assert(source != null);
    Debug.Assert(chunkSize > 0);

    // Get the enumerator.  Dispose of when done.
    using (IEnumerator<T> enumerator = source.GetEnumerator())
    do
    {
        // Move to the next element.  If there's nothing left
        // then get out.
        if (!enumerator.MoveNext()) yield break;

        // Return the chunked sequence.
        yield return ChunkSequence(enumerator, chunkSize);
    } while (true);
}

Basically, it gets the IEnumerator<T> and manually iterates through each item. It checks whether there are currently any items to be enumerated. After each chunk is enumerated through, if there aren't any items left, it breaks out.

Once it detects there are items in the sequence, it delegates the responsibility for the inner IEnumerable<T> implementation to ChunkSequence:

private static IEnumerable<T> ChunkSequence<T>(IEnumerator<T> enumerator, 
    int chunkSize)
{
    // Validate parameters.
    Debug.Assert(enumerator != null);
    Debug.Assert(chunkSize > 0);

    // The count.
    int count = 0;

    // There is at least one item.  Yield and then continue.
    do
    {
        // Yield the item.
        yield return enumerator.Current;
    } while (++count < chunkSize && enumerator.MoveNext());
}

Since MoveNext was already called on the IEnumerator<T> passed to ChunkSequence, it yields the item returned by Current, then increments the count, making sure never to return more than chunkSize items, and moves to the next item in the sequence after every iteration (short-circuiting once the number of items yielded reaches the chunk size).

If there are no items left, then the ChunkInternal method will make another pass in the outer loop, but when MoveNext is called a second time, it will still return false, as per the documentation (emphasis mine):

If MoveNext passes the end of the collection, the enumerator is positioned after the last element in the collection and MoveNext returns false. When the enumerator is at this position, subsequent calls to MoveNext also return false until Reset is called.

At this point, the loop will break, and the sequence of sequences will terminate.

This is a simple test:

static void Main()
{
    string s = "agewpsqfxyimc";

    int count = 0;

    // Group by three.
    foreach (IEnumerable<char> g in s.Chunk(3))
    {
        // Print out the group.
        Console.Write("Group: {0} - ", ++count);

        // Print the items.
        foreach (char c in g)
        {
            // Print the item.
            Console.Write(c + ", ");
        }

        // Finish the line.
        Console.WriteLine();
    }
}

Output:

Group: 1 - a, g, e,
Group: 2 - w, p, s,
Group: 3 - q, f, x,
Group: 4 - y, i, m,
Group: 5 - c,

An important note: this will not work if you don't drain the entire child sequence, or if you break out at any point in the parent sequence. This is an important caveat, but if your use case is that you will consume every element of the sequence of sequences, then this will work for you.

Additionally, it will do strange things if you play with the order, just as Sam's did at one point.
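A hedged sketch of the first caveat, using the `Chunk` extension above (no expected output is asserted, since the exact elements returned depend on how far the shared enumerator has advanced):

```csharp
// Skipping the first chunk without draining it does NOT advance the
// source enumerator past 'e' -- only 'a' was consumed by the outer
// MoveNext -- so the "second" chunk does not start at 'w' as you
// might expect from the grouping in the test above.
var second = "agewpsqfxyimc".Chunk(3).Skip(1).First().ToList();
```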

Duc Filan
casperOne
  • I think this is the best solution... the only problem is that list doesn't have Length... it has Count. But that's easy to change. We can make this better by not even constructing Lists but returning ienumerables that contain references to the main list with a offset/length combination. So then, if the groupsize is big, we don't waste memory. Comment if you want me to write it up. – Amir Nov 18 '09 at 04:39
  • @Amir i'd like to see that written up – samandmoore May 31 '11 at 15:38
  • 1
    This is nice and fast - Cameron posted a very similar one as well after yours, only caveat is that it buffers chunks, this can lead to out-of-memory if chunks and item sizes are big. See my answer for an alternative, albeit much hairier, answer. – Sam Saffron May 03 '12 at 11:59
  • @SamSaffron Yeah, if you have a large number of items in the `List`, you are obviously going to have memory issues because of the buffering. In retrospect, I should have noted that in the answer, but it seemed at the time the focus was on too many iterations. That said, your solution is indeed hairier. I haven't tested it, but now it has me wondering if there's a less hairy solution. – casperOne May 03 '12 at 12:05
  • @casperOne yeah ... Google gave me this page when I was searching for a way to split up enumerables, for my specific use case I am splitting an insanely big list of records that are returned from the db, if I materialize them to a list it would blow up (in fact dapper has a buffer:false option just for this use case) – Sam Saffron May 03 '12 at 12:11
  • curious to see a cleaner solution that maintains the same perf, it is a very interesting problem – Sam Saffron May 03 '12 at 12:13
  • @SamSaffron I'll do the work on it today. Do you need thread-safety? I have run into that problem about bringing things in from the database a long time ago. Chunking in the database (if your tables allow for it) helps tremendously (I have a super-deep table with hundreds of millions of rows which I have natural partitions on so I can chunk on the database level and not have queries go nuts). – casperOne May 03 '12 at 12:20
  • Thread safety is really not a problem I need to solve, my solution would easily be adapted to a thread safe one quite easily, but multi threading would probably degrade perf quite a bit unless some other adjustments are made, found this problem quite interesting which is why I spent an hour or two on it. – Sam Saffron May 03 '12 at 12:28
  • initially I did pass enumerators around and my implementation was way simpler, but `Enumerable.Range(0, 100).Chunk(3).Reverse().ToArray()` which was really tricky to fix – Sam Saffron May 03 '12 at 21:52
  • I originally needed thread safety as my linq query was context.Select(GetJobData).Clump(10).AsParallel().Select(ProcessJobs) and I needed to clump the jobs so the thread was running for a reasonable length of time instead of massive thread switching. :P – Cameron MacFarland May 05 '12 at 02:13
  • This is the clear winner in terms of the efficiency / simplicity trade-off. Way too many comments though :) – Ohad Schneider Aug 24 '17 at 08:54
  • Just threw this code (the updated section) into a project to test and it just returns nulls... I am finding the one by Marc Andre seems to be working, along with the one from JaredPar. – Kevin Cook Dec 13 '21 at 21:10
59

Ok, here's my take on it:

  • completely lazy: works on infinite enumerables
  • no intermediate copying/buffering
  • O(n) execution time
  • works also when inner sequences are only partially consumed

public static IEnumerable<IEnumerable<T>> Chunks<T>(this IEnumerable<T> enumerable,
                                                    int chunkSize)
{
    if (chunkSize < 1) throw new ArgumentException("chunkSize must be positive");

    using (var e = enumerable.GetEnumerator())
    while (e.MoveNext())
    {
        var remaining = chunkSize;    // elements remaining in the current chunk
        var innerMoveNext = new Func<bool>(() => --remaining > 0 && e.MoveNext());

        yield return e.GetChunk(innerMoveNext);
        while (innerMoveNext()) {/* discard elements skipped by inner iterator */}
    }
}

private static IEnumerable<T> GetChunk<T>(this IEnumerator<T> e,
                                          Func<bool> innerMoveNext)
{
    do yield return e.Current;
    while (innerMoveNext());
}

Example Usage

var src = new [] {1, 2, 3, 4, 5, 6}; 

var c3 = src.Chunks(3);      // {{1, 2, 3}, {4, 5, 6}}; 
var c4 = src.Chunks(4);      // {{1, 2, 3, 4}, {5, 6}}; 

var sum   = c3.Select(c => c.Sum());    // {6, 15}
var count = c3.Count();                 // 2
var take2 = c3.Select(c => c.Take(2));  // {{1, 2}, {4, 5}}

Explanations

The code works by nesting two yield-based iterators.

The outer iterator must keep track of how many elements have been effectively consumed by the inner (chunk) iterator. This is done by closing over remaining with innerMoveNext(). Unconsumed elements of a chunk are discarded before the next chunk is yielded by the outer iterator. This is necessary because otherwise you get inconsistent results, when the inner enumerables are not (completely) consumed (e.g. c3.Count() would return 6).

Note: The answer has been updated to address the shortcomings pointed out by @aolszowka.

3dGrabber
  • 3
    Very nice. My "correct" solution was way more complicated than that. This is the #1 answer IMHO. – CaseyB Jan 08 '14 at 21:37
  • 1
    This suffers from unexpected (from an API standpoint) behavior when ToArray() is called, it is also not thread-safe. – aolszowka May 13 '14 at 13:57
  • @aolszowka: could you please elaborate? – 3dGrabber May 14 '14 at 08:40
  • @3dGrabber Perhaps it was how I re-factored your code (sorry its a bit too long to past here, basically instead of an extension method I passed in the sourceEnumerator). The test case I used was something to this effect: int[] arrayToSort = new int[] { 9, 7, 2, 6, 3, 4, 8, 5, 1, 10, 11, 12, 13 }; var source = Chunkify(arrayToSort, 3).ToArray(); Resulted in Source indicating that there were 13 chunks (the number of elements). This made sense to me as unless you queried the inner enumerations the Enumerator was not incremented. – aolszowka May 14 '14 at 14:40
  • @3dGrabber (continued) Furthermore, because the inner enumerations are not generated until accessed, attempting to perform some type of threaded operation across them could result in a race condition in which a thread calls sourceEnumerator.MoveNext() prior to another thread reading sourceEnumerator.Current. At least that's what I came up with, if I'm wrong please let me know, I'm willing to learn! – aolszowka May 14 '14 at 14:49
  • 1
    @aolszowka: very valid points. I've added a warning and a usage section. The code assumes that you iterate over the inner enumerable. With your solution you forfeit the laziness though. I think it should be possible to get the best of both worlds with a custom, caching IEnumerator. If I find a solution I'll post it here... – 3dGrabber May 20 '14 at 09:10
  • yield return enumerator.GetChunk(chunkSize).ToArray(); is the simple solution at the cost of buffering. The other option is to throw an exception when the previous inner enumerable isn't exhausted when the outer is next yielded. (I suspect you're thinking about buffering only if required; that'd be cool too.) – CaseyB May 21 '14 at 14:10
  • 1
    @3dGrabber I'm trying to use this (because elegant) for the non-lazy case to split larger collections of complex objects (basically, get and .ToList()), but cannot seem to get it to return more than the first chunk. No custom enumerator. Realizing this is vague, any idea why that might happen with a straight (non-generic) copy of this? – downwitch Jul 26 '16 at 18:24
  • @3dGrabber never mind, sorry for the disturbance. In my case, I expanded innerMoveNext to do some logging, and was able to see that the call to `while (innerMoveNext()) {/* discard elements skipped by inner iterator */}` was the problem--as soon as I removed that, it worked for me. I am not sure I am familiar enough with your source well enough to know whether that line might serve some purpose, but in my case it seemed redundant to the enumerating in `GetChunk`. – downwitch Jul 27 '16 at 17:42
  • Having spent some quality time with this (and ignoring my previous/hidden comments), I agree with CaseyB that this is a great, elegant, (ultimately) simple solution. But I maintain that the `discard elements...` commented line serves no measurable purpose (remaining will always be <0, because of the do/while in the "inner" loop), and it confuses the issue--try changing the `int chunkSize` to a `uint`, and enjoy the party. (Otherwise, if you "partially consume", then you stop before that line anyway.) Just in case people lose the day or two I did trying to make sense of it. – downwitch Aug 04 '16 at 21:56
  • To see the purpose of `while (innerMoveNext())`, comment it out and run `c3.Count()` from the examples. – 3dGrabber Aug 15 '16 at 10:08
  • Unfortunately, the last usage example doesn't work for me: `var take2 = c3.Select(c => c.Take(2));` It returns `[[6], [6]]`. If I modify it by appending ToList() inside the Select(), then I get the expected result. Still, this is very useful! – Gyromite Aug 16 '19 at 20:02
  • Somewhere along the way the code must have been changed because this does not work. – Kevin Cook Dec 13 '21 at 20:30
  • @KevinCook it works however has some limitations. You shouldn't convert the `IEnumerable` to an array or a list like `src.Chunks(3).ToArray()` or `.ToList()`. Keep it `IEnumerable` to keep it working. – oleksa Dec 29 '21 at 10:46
57

Update .NET 6.0

.NET 6.0 added a new native Chunk method to the System.Linq namespace:

public static System.Collections.Generic.IEnumerable<TSource[]> Chunk<TSource> (
   this System.Collections.Generic.IEnumerable<TSource> source, int size);

Using this new method, every chunk except the last will contain size elements; the last chunk holds the remaining elements and may be smaller.

Here is an example:

var list = Enumerable.Range(1, 100);
var chunkSize = 10;

foreach(var chunk in list.Chunk(chunkSize)) //Returns a chunk with the correct size. 
{
    Parallel.ForEach(chunk, (item) =>
    {
        //Do something Parallel here. 
        Console.WriteLine(item);
    });
}

You’re probably thinking: why not just use Skip and Take? You could, but Chunk is a bit more concise and makes the intent that little bit more readable.

KyleMit
  • 30,350
  • 66
  • 462
  • 664
Majid Shahabfar
  • 4,010
  • 2
  • 28
  • 36
  • 1
    In case you're stuck on an older framework, the .NET source code is available here: https://github.com/dotnet/runtime/blob/main/src/libraries/System.Linq/src/System/Linq/Chunk.cs The implementation is very concise, and seems to have parity with Sam Saffron's answer, which is what I had been using prior to .NET 6. – Gyromite Feb 28 '23 at 04:54
18

completely lazy, no counting or copying:

public static class EnumerableExtensions
{

  public static IEnumerable<IEnumerable<T>> Split<T>(this IEnumerable<T> source, int len)
  {
     if (len < 1)
        throw new ArgumentOutOfRangeException(nameof(len), "len must be greater than 0");

     var enumer = source.GetEnumerator();
     while (enumer.MoveNext())
     {
        yield return Take(enumer.Current, enumer, len);
     }
  }

  private static IEnumerable<T> Take<T>(T head, IEnumerator<T> tail, int len)
  {
     while (true)
     {
        yield return head;
        if (--len == 0)
           break;
        if (tail.MoveNext())
           head = tail.Current;
        else
           break;
     }
  }
}
xtofs
  • 421
  • 6
  • 8
  • This solution is so elegant that I'm sorry that I cannot upvote this answer more than once. – Mark Apr 10 '15 at 14:21
  • 3
    I don't think this would ever fail, exactly. But it could certainly have some odd behaviour. If you had 100 items, and you split into batches of 10, and you enumerated all the batches without enumerating any items of those batches, you'd end up with 100 batches of 1. – CaseyB Apr 15 '15 at 20:58
  • 1
    As @CaseyB mentioned, this suffers from the same failing 3dGrabber addressed here http://stackoverflow.com/a/20953521/1037948, but man is it fast! – drzaus Jun 10 '15 at 18:04
  • 1
    This is a beautiful solution. Does exactly what it promises. – Rod Hartzell Feb 16 '16 at 22:59
  • By far the most elegant and to the point solution. Only thing is, you should add a check for negative numbers, and replace the ArgumentNullException by a ArgumentException – Romain Vergnory Apr 02 '19 at 14:45
14

I think the following suggestion would be the fastest. I am sacrificing the laziness of the source Enumerable for the ability to use Array.Copy and knowing ahead of time the length of each of my sublists.

public static IEnumerable<T[]> Chunk<T>(this IEnumerable<T> items, int size)
{
    T[] array = items as T[] ?? items.ToArray();
    for (int i = 0; i < array.Length; i+=size)
    {
        T[] chunk = new T[Math.Min(size, array.Length - i)];
        Array.Copy(array, i, chunk, 0, chunk.Length);
        yield return chunk;
    }
}
  • Not just fastest, it also correctly handles further enumerable operations on the result, i.e. items.Chunk(5).Reverse().SelectMany(x => x) – too Dec 16 '19 at 21:40
12

For anyone interested in a packaged/maintained solution, the MoreLINQ library provides the Batch extension method which matches your requested behavior:

IEnumerable<char> source = "Example string";
IEnumerable<IEnumerable<char>> chunksOfThreeChars = source.Batch(3);

The Batch implementation is similar to Cameron MacFarland's answer, with the addition of an overload for transforming the chunk/batch before returning, and performs quite well.

Kevinoid
  • 4,180
  • 40
  • 25
  • 5
    this should be the accepted answer. Instead of reinventing the wheel, morelinq should be used – Otabek Kholikov Sep 16 '19 at 12:41
  • 2
    Indeed. Checked the source code on github, it is superior to anything on this page. Including my answer :) I did initially check moreLinq, but I was looking something with "Chunk" in its name. – Zar Shardan Jun 21 '20 at 09:28
  • This was by far the simplest, easiest, and fastest to implement solution for me. This should be the top answer, it seems like other people got caught up in leetcoding this one instead of going for the simplest solution. – J_L Apr 19 '22 at 00:04
11

I wrote a Clump extension method several years ago. Works great, and is the fastest implementation here. :P

/// <summary>
/// Clumps items into same size lots.
/// </summary>
/// <typeparam name="T"></typeparam>
/// <param name="source">The source list of items.</param>
/// <param name="size">The maximum size of the clumps to make.</param>
/// <returns>A list of list of items, where each list of items is no bigger than the size given.</returns>
public static IEnumerable<IEnumerable<T>> Clump<T>(this IEnumerable<T> source, int size)
{
    if (source == null)
        throw new ArgumentNullException("source");
    if (size < 1)
        throw new ArgumentOutOfRangeException("size", "size must be greater than 0");

    return ClumpIterator<T>(source, size);
}

private static IEnumerable<IEnumerable<T>> ClumpIterator<T>(IEnumerable<T> source, int size)
{
    Debug.Assert(source != null, "source is null.");

    T[] items = new T[size];
    int count = 0;
    foreach (var item in source)
    {
        items[count] = item;
        count++;

        if (count == size)
        {
            yield return items;
            items = new T[size];
            count = 0;
        }
    }
    if (count > 0)
    {
        if (count == size)
            yield return items;
        else
        {
            T[] tempItems = new T[count];
            Array.Copy(items, tempItems, count);
            yield return tempItems;
        }
    }
}
Cameron MacFarland
  • 70,676
  • 20
  • 104
  • 133
9

System.Interactive provides Buffer() for this purpose. Some quick testing shows performance is similar to Sam's solution.
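A minimal usage sketch (assuming the System.Interactive NuGet package, a.k.a. Ix.NET, where the `Buffer` extension lives in the `System.Linq` namespace and returns `IEnumerable<IList<T>>`):

```csharp
// Requires the System.Interactive (Ix.NET) NuGet package.
using System;
using System.Collections.Generic;
using System.Linq;

class BufferDemo
{
    static void Main()
    {
        // Buffer(3) groups the sequence into consecutive chunks of up to 3 elements.
        foreach (IList<int> chunk in Enumerable.Range(1, 7).Buffer(3))
            Console.WriteLine(string.Join(",", chunk));
    }
}
```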

dahlbyk
  • 75,175
  • 8
  • 100
  • 122
  • 1
    do you know the buffering semantics? eg: if you have an enumerator that spits out strings that are 300k big and try to split it into 10,000 size chunks will you get an out of memory? – Sam Saffron May 03 '12 at 11:52
  • `Buffer()` returns `IEnumerable<IList<T>>` so yeah, you'd probably have a problem there - it doesn't stream like yours. – dahlbyk May 03 '12 at 19:53
  • Yeah but if you want streaming then use the Observable Buffer method instead in the same repo (Rx.NET) – Frank Sep 15 '21 at 10:20
9

We can improve @JaredPar's solution to do true lazy evaluation. We use a GroupAdjacentBy method that yields groups of consecutive elements with the same key:

sequence
.Select((x, i) => new { Value = x, Index = i })
.GroupAdjacentBy(x=>x.Index/3)
.Select(g=>g.Select(x=>x.Value))

Because the groups are yielded one-by-one, this solution works efficiently with long or infinite sequences.
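GroupAdjacentBy is not part of the BCL (MoreLINQ ships a similar GroupAdjacent); here is a minimal sketch of one possible implementation, assuming the key type supports default equality:

```csharp
using System;
using System.Collections.Generic;

public static class GroupAdjacentExtensions
{
    // Yields groups of consecutive elements that share the same key.
    // Each group is emitted as soon as the key changes, so at most one
    // group is buffered at a time (unlike GroupBy, which buffers everything).
    public static IEnumerable<IEnumerable<T>> GroupAdjacentBy<T, TKey>(
        this IEnumerable<T> source, Func<T, TKey> keySelector)
    {
        var comparer = EqualityComparer<TKey>.Default;
        List<T> group = null;
        TKey currentKey = default;

        foreach (var item in source)
        {
            var key = keySelector(item);
            if (group == null || !comparer.Equals(key, currentKey))
            {
                if (group != null)
                    yield return group;
                group = new List<T>();
                currentKey = key;
            }
            group.Add(item);
        }
        if (group != null)
            yield return group;
    }
}
```

Because the index-based keys (i / 3) form contiguous runs, this makes the chunking above lazy even over infinite sequences.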

Colonel Panic
  • 132,665
  • 89
  • 401
  • 465
7

I find this little snippet does the job quite nicely.

public static IEnumerable<List<T>> Chunked<T>(this List<T> source, int chunkSize)
{
    var offset = 0;

    while (offset < source.Count)
    {
        yield return source.GetRange(offset, Math.Min(source.Count - offset, chunkSize));
        offset += chunkSize;
    }
}
Vivek Jain
  • 3,811
  • 6
  • 30
  • 47
erlando
  • 6,736
  • 4
  • 24
  • 29
6

Here's a list splitting routine I wrote a couple months ago:

public static List<List<T>> Chunk<T>(
    List<T> theList,
    int chunkSize
)
{
    List<List<T>> result = theList
        .Select((x, i) => new {
            data = x,
            indexgroup = i / chunkSize
        })
        .GroupBy(x => x.indexgroup, x => x.data)
        .Select(g => new List<T>(g))
        .ToList();

    return result;
}
Amy B
  • 108,202
  • 21
  • 135
  • 185
5

What about this one?

var input = new List<string> { "a", "g", "e", "w", "p", "s", "q", "f", "x", "y", "i", "m", "c" };
var k = 3;

var res = Enumerable.Range(0, (input.Count + k - 1) / k)
                    .Select(i => input.GetRange(i * k, Math.Min(k, input.Count - i * k)))
                    .ToList();

As far as I know, GetRange() is linear in terms of number of items taken. So this should perform well.

Roman Pekar
  • 107,110
  • 28
  • 195
  • 197
5

This is an old question but this is what I ended up with; it enumerates the enumerable only once, but does create lists for each of the partitions. It doesn't suffer from unexpected behavior when ToArray() is called as some of the implementations do:

    public static IEnumerable<IEnumerable<T>> Partition<T>(IEnumerable<T> source, int chunkSize)
    {
        if (source == null)
        {
            throw new ArgumentNullException("source");
        }

        if (chunkSize < 1)
        {
            throw new ArgumentException("Invalid chunkSize: " + chunkSize);
        }

        using (IEnumerator<T> sourceEnumerator = source.GetEnumerator())
        {
            IList<T> currentChunk = new List<T>();
            while (sourceEnumerator.MoveNext())
            {
                currentChunk.Add(sourceEnumerator.Current);
                if (currentChunk.Count == chunkSize)
                {
                    yield return currentChunk;
                    currentChunk = new List<T>();
                }
            }

            if (currentChunk.Any())
            {
                yield return currentChunk;
            }
        }
    }
aolszowka
  • 1,300
  • 12
  • 36
  • Would be good to convert this into an extension method: `public static IEnumerable<IEnumerable<T>> Partition<T>(this IEnumerable<T> source, int chunkSize)` – krizzzn Jun 24 '14 at 14:30
  • +1 for your answer. However i recommend two things 1. use foreach instead of while and using block. 2. Pass chunkSize in the constructor of List so that list knows its maximum expected size. – Usman Zafar Jul 21 '14 at 07:26
5

We found David B's solution worked best, but we adapted it into a more general solution:

list.GroupBy(item => item.SomeProperty) 
   .Select(group => new List<T>(group)) 
   .ToArray();
mwjackson
  • 5,403
  • 10
  • 53
  • 58
5

Old code, but this is what I've been using:

    public static IEnumerable<List<T>> InSetsOf<T>(this IEnumerable<T> source, int max)
    {
        var toReturn = new List<T>(max);
        foreach (var item in source)
        {
            toReturn.Add(item);
            if (toReturn.Count == max)
            {
                yield return toReturn;
                toReturn = new List<T>(max);
            }
        }
        if (toReturn.Any())
        {
            yield return toReturn;
        }
    }
Robert McKee
  • 21,305
  • 1
  • 43
  • 57
  • After posting, I realized this is pretty much exactly the same code casperOne posted 6 years ago with the change of using .Any() instead of .Count() as I don't need the entire count, just need to know if any exist. – Robert McKee Feb 25 '15 at 09:47
4

The following solution is the most compact I could come up with that is O(n).

public static IEnumerable<T[]> Chunk<T>(IEnumerable<T> source, int chunksize)
{
    var list = source as IList<T> ?? source.ToList();
    for (int start = 0; start < list.Count; start += chunksize)
    {
        T[] chunk = new T[Math.Min(chunksize, list.Count - start)];
        for (int i = 0; i < chunk.Length; i++)
            chunk[i] = list[start + i];

        yield return chunk;
    }
}
3

It's an old question but I had a different approach. I use Skip to move to the desired offset and Take to extract the desired number of elements:

public static IEnumerable<IEnumerable<T>> Chunk<T>(this IEnumerable<T> source, 
                                                   int chunkSize)
{
    if (chunkSize <= 0)
        throw new ArgumentOutOfRangeException(nameof(chunkSize), $"{nameof(chunkSize)} should be > 0");

    var nbChunks = (int)Math.Ceiling((double)source.Count()/chunkSize);

    return Enumerable.Range(0, nbChunks)
                     .Select(chunkNb => source.Skip(chunkNb*chunkSize)
                     .Take(chunkSize));
}
ΩmegaMan
  • 29,542
  • 12
  • 100
  • 122
Bertrand
  • 601
  • 1
  • 7
  • 10
  • 1
    Very similar to an approach I used, but I recommend that source not be IEnumerable. For example, if source is the result of a LINQ query the Skip/Take would trigger nbChunk enumerations of the query. Could get expensive. Better would be to use IList or ICollection as the type for source. That avoids the problem altogether. – RB Davidson Apr 27 '18 at 12:25
3

If the list is a System.Collections.Generic.List<T>, you can use its CopyTo method to copy a range of elements into sub-arrays: you specify the start element and the number of elements to copy.

You could also make three clones of your original list and use RemoveRange on each to shrink it to the range you want.

Or just create a helper method to do it for you.
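For illustration, a hedged sketch of such a helper built on List<T>.CopyTo (the SplitBy name and shape are mine, not a BCL API):

```csharp
using System;
using System.Collections.Generic;

public static class SplitHelper
{
    // Copies size-item slices of the list into fresh arrays via
    // List<T>.CopyTo(sourceIndex, destArray, destIndex, count).
    public static List<T[]> SplitBy<T>(List<T> source, int size)
    {
        if (size < 1) throw new ArgumentOutOfRangeException(nameof(size));

        var result = new List<T[]>();
        for (int i = 0; i < source.Count; i += size)
        {
            var chunk = new T[Math.Min(size, source.Count - i)];
            source.CopyTo(i, chunk, 0, chunk.Length);
            result.Add(chunk);
        }
        return result;
    }
}
```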

Jobo
  • 940
  • 6
  • 11
2

Another way is using Rx Buffer operator

//using System.Linq;
//using System.Reactive.Linq;
//using System.Reactive.Threading.Tasks;

var observableBatches = anAnumerable.ToObservable().Buffer(size);

var batches = aList.ToObservable().Buffer(size).ToList().ToTask().GetAwaiter().GetResult();
frhack
  • 4,862
  • 2
  • 28
  • 25
2

The question was how to "Split List into Sublists with LINQ", but sometimes you may want those sub-lists to be references to the original list, not copies. This allows you to modify the original list from the sub-lists. In that case, this may work for you.

public static IEnumerable<Memory<T>> RefChunkBy<T>(this T[] array, int size)
{
    if (array is null)
    {
        throw new ArgumentNullException(nameof(array));
    }
    if (size < 1)
    {
        throw new ArgumentException("size must be positive", nameof(size));
    }

    var index = 0;
    var counter = 0;

    for (int i = 0; i < array.Length; i++)
    {
        if (counter == size)
        {
            yield return new Memory<T>(array, index, size);
            index = i;
            counter = 0;
        }
        counter++;

        if (i + 1 == array.Length)
        {
            yield return new Memory<T>(array, index, array.Length - index);
        }
    }
}

Usage:

var src = new[] { 1, 2, 3, 4, 5, 6 };

var c3 = RefChunkBy(src, 3);      // {{1, 2, 3}, {4, 5, 6}};
var c4 = RefChunkBy(src, 4);      // {{1, 2, 3, 4}, {5, 6}};

// as extension method
var c3 = src.RefChunkBy(3);      // {{1, 2, 3}, {4, 5, 6}};
var c4 = src.RefChunkBy(4);      // {{1, 2, 3, 4}, {5, 6}};

var sum = c3.Select(c => c.Span.ToArray().Sum());    // {6, 15}
var count = c3.Count();                 // 2
var take2 = c3.Select(c => c.Span.ToArray().Take(2));  // {{1, 2}, {4, 5}}

Feel free to make this code better.

dek
  • 85
  • 1
  • 7
1

Using modular partitioning:

public IEnumerable<IEnumerable<string>> Split(IEnumerable<string> input, int chunkSize)
{
    var chunks = (int)Math.Ceiling((double)input.Count() / (double)chunkSize);
    return Enumerable.Range(0, chunks).Select(id => input.Where(s => s.GetHashCode() % chunks == id));
}
nawfal
  • 70,104
  • 56
  • 326
  • 368
Janosz G.
  • 11
  • 1
1

Just putting in my two cents. If you wanted to "bucket" the list (visualize left to right), you could do the following:

 public static List<List<T>> Buckets<T>(this List<T> source, int numberOfBuckets)
    {
        List<List<T>> result = new List<List<T>>();
        for (int i = 0; i < numberOfBuckets; i++)
        {
            result.Add(new List<T>());
        }

        int count = 0;
        while (count < source.Count())
        {
            var mod = count % numberOfBuckets;
            result[mod].Add(source[count]);
            count++;
        }
        return result;
    }
mattylantz
  • 312
  • 4
  • 10
1
public static List<List<T>> GetSplitItemsList<T>(List<T> originalItemsList, short number)
    {
        var listGroup = new List<List<T>>();
        for (int i = 0; i < originalItemsList.Count; i += number)
        {
            var cList = originalItemsList.Skip(i).Take(number).ToList();
            listGroup.Add(cList);
        }
        return listGroup;
    }
Joy Zhu
  • 446
  • 4
  • 9
1

From .NET 6 you can now use native Chunk() method, available for both IEnumerable<T> and IQueryable<T>.

More info (and links) here: https://learn.microsoft.com/en-us/dotnet/csharp/programming-guide/concepts/linq/partitioning-data

paradise
  • 336
  • 2
  • 15
0

To insert my two cents...

By using the list type for the source to be chunked, I found another very compact solution:

public static IEnumerable<IEnumerable<TSource>> Chunk<TSource>(this IEnumerable<TSource> source, int chunkSize)
{
    // copy the source into a list
    var chunkList = source.ToList();

    // return chunks of 'chunkSize' items
    while (chunkList.Count > chunkSize)
    {
        yield return chunkList.GetRange(0, chunkSize);
        chunkList.RemoveRange(0, chunkSize);
    }

    // return the rest (skip it when the source was empty)
    if (chunkList.Count > 0)
        yield return chunkList;
}
Patrick
  • 668
  • 4
  • 11
0

I took the primary answer and inverted the control: the caller supplies a predicate that decides where to split. (After all, who reading this post while searching for an answer really wants to split only into groups of 3?)

This method allows one to split on any type of item as needed.

public static List<List<T>> SplitOn<T>(List<T> main, Func<T, bool> splitOn)
{
    int groupIndex = 0;

    return main.Select( item => new 
                             { 
                               Group = (splitOn.Invoke(item) ? ++groupIndex : groupIndex), 
                               Value = item 
                             })
                .GroupBy( it2 => it2.Group)
                .Select(x => x.Select(v => v.Value).ToList())
                .ToList();
}

So for the OP the code would be

var it = new List<string>()
                       { "a", "g", "e", "w", "p", "s", "q", "f", "x", "y", "i", "m", "c" };

int index = 0; 
var result = SplitOn(it, (itm) => (index++ % 3) == 0 );
ΩmegaMan
  • 29,542
  • 12
  • 100
  • 122
0

As performant as Sam Saffron's approach.

public static IEnumerable<IEnumerable<T>> Batch<T>(this IEnumerable<T> source, int size)
{
    if (source == null) throw new ArgumentNullException(nameof(source));
    if (size <= 0) throw new ArgumentOutOfRangeException(nameof(size), "Size must be greater than zero.");

    return BatchImpl(source, size).TakeWhile(x => x.Any());
}

static IEnumerable<IEnumerable<T>> BatchImpl<T>(this IEnumerable<T> source, int size)
{
    var values = new List<T>();
    var group = 1;
    var disposed = false;
    var e = source.GetEnumerator();

    try
    {
        while (!disposed)
        {
            yield return GetBatch(e, values, group, size, () => { e.Dispose(); disposed = true; });
            group++;
        }
    }
    finally
    {
        if (!disposed)
            e.Dispose();
    }
}

static IEnumerable<T> GetBatch<T>(IEnumerator<T> e, List<T> values, int group, int size, Action dispose)
{
    var min = (group - 1) * size + 1;
    var max = group * size;
    var hasValue = false;

    while (values.Count < min && e.MoveNext())
    {
        values.Add(e.Current);
    }

    for (var i = min; i <= max; i++)
    {
        if (i <= values.Count)
        {
            hasValue = true;
        }
        else if (hasValue = e.MoveNext())
        {
            values.Add(e.Current);
        }
        else
        {
            dispose();
        }

        if (hasValue)
            yield return values[i - 1];
        else
            yield break;
    }
}


leandromoh
  • 139
  • 6
0

Can work with infinite generators:

a.Zip(a.Skip(1), (x, y) => Enumerable.Repeat(x, 1).Concat(Enumerable.Repeat(y, 1)))
 .Zip(a.Skip(2), (xy, z) => xy.Concat(Enumerable.Repeat(z, 1)))
 .Where((x, i) => i % 3 == 0)

Demo code: https://ideone.com/GKmL7M

using System;
using System.Collections.Generic;
using System.Linq;

public class Test
{
  private static void DoIt(IEnumerable<int> a)
  {
    Console.WriteLine(String.Join(" ", a));

    foreach (var x in a.Zip(a.Skip(1), (x, y) => Enumerable.Repeat(x, 1).Concat(Enumerable.Repeat(y, 1))).Zip(a.Skip(2), (xy, z) => xy.Concat(Enumerable.Repeat(z, 1))).Where((x, i) => i % 3 == 0))
      Console.WriteLine(String.Join(" ", x));

    Console.WriteLine();
  }

  public static void Main()
  {
    DoIt(new int[] {1});
    DoIt(new int[] {1, 2});
    DoIt(new int[] {1, 2, 3});
    DoIt(new int[] {1, 2, 3, 4});
    DoIt(new int[] {1, 2, 3, 4, 5});
    DoIt(new int[] {1, 2, 3, 4, 5, 6});
  }
}
1

1 2

1 2 3
1 2 3

1 2 3 4
1 2 3

1 2 3 4 5
1 2 3

1 2 3 4 5 6
1 2 3
4 5 6

But I would actually prefer to write the corresponding method without LINQ.

Qwertiy
  • 19,681
  • 15
  • 61
  • 128
0

Check this out! I have a list of elements with a sequence counter and date. Each time the sequence restarts, I want to create a new list.

Ex. list of messages.

 List<dynamic> messages = new List<dynamic>
        {
            new { FcntUp = 101, CommTimestamp = "2019-01-01 00:00:01" },
            new { FcntUp = 102, CommTimestamp = "2019-01-01 00:00:02" },
            new { FcntUp = 103, CommTimestamp = "2019-01-01 00:00:03" },

            //restart of sequence
            new { FcntUp = 1, CommTimestamp = "2019-01-01 00:00:04" },
            new { FcntUp = 2, CommTimestamp = "2019-01-01 00:00:05" },
            new { FcntUp = 3, CommTimestamp = "2019-01-01 00:00:06" },

            //restart of sequence
            new { FcntUp = 1, CommTimestamp = "2019-01-01 00:00:07" },
            new { FcntUp = 2, CommTimestamp = "2019-01-01 00:00:08" },
            new { FcntUp = 3, CommTimestamp = "2019-01-01 00:00:09" }
        };

I want to split the list into separate lists as the counter restarts. Here is the code:

var arraylist = new List<List<dynamic>>();

        List<dynamic> messages = new List<dynamic>
        {
            new { FcntUp = 101, CommTimestamp = "2019-01-01 00:00:01" },
            new { FcntUp = 102, CommTimestamp = "2019-01-01 00:00:02" },
            new { FcntUp = 103, CommTimestamp = "2019-01-01 00:00:03" },

            //restart of sequence
            new { FcntUp = 1, CommTimestamp = "2019-01-01 00:00:04" },
            new { FcntUp = 2, CommTimestamp = "2019-01-01 00:00:05" },
            new { FcntUp = 3, CommTimestamp = "2019-01-01 00:00:06" },

            //restart of sequence
            new { FcntUp = 1, CommTimestamp = "2019-01-01 00:00:07" },
            new { FcntUp = 2, CommTimestamp = "2019-01-01 00:00:08" },
            new { FcntUp = 3, CommTimestamp = "2019-01-01 00:00:09" }
        };

        //group by FcntUp and CommTimestamp
        var query = messages.GroupBy(x => new { x.FcntUp, x.CommTimestamp });

        //declare the current item
        dynamic currentItem = null;

        //declare the list of ranges
        List<dynamic> range = null;

        //loop through the sorted list
        foreach (var item in query)
        {
            //check if start of new range
            if (currentItem == null || item.Key.FcntUp < currentItem.Key.FcntUp)
            {
                //create a new list if the FcntUp starts on a new range
                range = new List<dynamic>();

                //add the list to the parent list
                arraylist.Add(range);
            }

            //add the item to the sublist
            range.Add(item);

            //set the current item
            currentItem = item;
        }
0

If the source collection implements IList<T> (random access by index) we could use the approach below. It only fetches the elements when they are actually accessed, so this is particularly useful for lazily-evaluated collections. Similar to the unbounded IEnumerable<T> solutions, but for IList<T>.

    public static IEnumerable<IEnumerable<T>> Chunkify<T>(this IList<T> src, int chunkSize)
    {
        if (src == null) throw new ArgumentNullException(nameof(src));
        if (chunkSize < 1) throw new ArgumentOutOfRangeException(nameof(chunkSize), $"must be > 0, got {chunkSize}");

        // ceiling division avoids yielding an empty trailing chunk
        // when src.Count is an exact multiple of chunkSize
        var chunkCount = (src.Count + chunkSize - 1) / chunkSize;
        for(var ci = 0; ci < chunkCount; ci++){
            yield return Window(src, ci*chunkSize, Math.Min((ci+1)*chunkSize, src.Count)-1);
        }
    }

    private static IEnumerable<T> Window<T>(IList<T> src, int startIdx, int endIdx)
    {
        while(startIdx <= endIdx){
            yield return src[startIdx++];
        }
    }
Zar Shardan
  • 5,675
  • 2
  • 39
  • 37
-1

There is no way to combine all desirable features like full laziness, no copying, full generality, and safety in one solution. The most fundamental reason is that it cannot be guaranteed that the input is not mutated before the chunks are accessed. Assume that we have a function with the following signature:

public static IEnumerable<IEnumerable<T>> Chunk<T>(this IEnumerable<T> source, int chunkSize)
{
    // Some implementation
}

Then the following way to use it is problematic:

var myList = new List<int>()
{
    1,2,3,4
};
var myChunks = myList.Chunk(2);
myList.RemoveAt(0);
var firstChunk = myChunks.First();    
Console.WriteLine("First chunk:" + String.Join(',', firstChunk));
myList.RemoveAt(0);
var secondChunk = myChunks.Skip(1).First();
Console.WriteLine("Second chunk:" + String.Join(',', secondChunk));
// What outputs do we see for first and second chunk? Probably not what you would expect...

Depending on the specific implementation the code will either fail with a runtime error or produce unintuitive results.

So, at least one of the properties needs to be weakened. If you want a bullet-proof lazy solution, you need to restrict your input to an immutable type, and even then it is not straightforward to cover all use cases. If you have control of the usage, you could still opt for the most general solution, as long as you make sure it is used in a way that works. Otherwise, you might drop the laziness and accept some amount of copying.

In the end, it all depends on your use case and requirements which solution is the best choice for you.

KloppyToppy
  • 244
  • 2
  • 5