124

Is there a nice way to split a collection into n parts with LINQ? Not necessarily evenly of course.

That is, I want to divide the collection into sub-collections, each of which contains a subset of the elements, where the last collection can be ragged.

jpaugh
Simon_Weaver
  • How exactly do you want them split, if not even (allowing for the end, of course)? – Marc Gravell Jan 13 '09 at 07:53
  • Related question [Split List into Sublists with LINQ](http://stackoverflow.com/questions/419019/split-list-into-sublists-with-linq) – Gennady Vanin Геннадий Ванин Apr 12 '13 at 02:47
  • @Simon_Weaver I tried to clarify what you're asking based on the accepted answer. In fact, there are many ways to 'split' a list, including decomposing each element of the list into its elements, and putting those in so-called 'parallel' lists. – jpaugh May 20 '17 at 19:36

21 Answers

132

The simplest pure LINQ solution is shown below.

static class LinqExtensions
{
    public static IEnumerable<IEnumerable<T>> Split<T>(this IEnumerable<T> list, int parts)
    {
        int i = 0;
        var splits = from item in list
                     group item by i++ % parts into part
                     select part.AsEnumerable();
        return splits;
    }
}
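A quick usage sketch (the `Demo` harness and sample values are mine, not from the answer). Note that the grouping deals items out round-robin, so the parts are interleaved rather than contiguous; also, because the counter `i` is captured by the query, enumerating the result a second time continues incrementing it and produces different groups.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class LinqExtensions
{
    // The Split extension from the answer above.
    public static IEnumerable<IEnumerable<T>> Split<T>(this IEnumerable<T> list, int parts)
    {
        int i = 0;
        var splits = from item in list
                     group item by i++ % parts into part
                     select part.AsEnumerable();
        return splits;
    }
}

class Demo
{
    static void Main()
    {
        // 1..7 into 3 parts: items are distributed round-robin by index % 3.
        foreach (var part in Enumerable.Range(1, 7).Split(3))
            Console.WriteLine(string.Join(",", part));
        // Prints:
        // 1,4,7
        // 2,5
        // 3,6
    }
}
```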
jpaugh
  • 6,634
  • 4
  • 38
  • 90
Muhammad Hasan Khan
  • 34,648
  • 16
  • 88
  • 131
  • 4
    Doing all those modulus operations can get a bit expensive on long lists. – Jonathan Allen Mar 17 '09 at 18:11
  • 13
    It would be better to use the Select overload that includes the index. – Marc Gravell Aug 17 '09 at 21:21
  • 1
    I've added a response that uses the select overload and method chaining syntax – reustmd Mar 14 '11 at 20:21
  • @HasanKhan: What if I don't know how many parts I'm splitting it into? I mean, split into groups of 3, let's say. – Robin Maben Nov 29 '11 at 13:30
  • Mine does not work; I just get one split with count 1 and all my items inside that one position. Any idea? – Lucas Rodrigues Sena Jan 22 '15 at 19:36
  • @LucasRodriguesSena This method asks you for the no. of splits, not how much you want in each split. You can get the no. of splits by dividing the total no. of items by your desired size. – Muhammad Hasan Khan Jan 23 '15 at 15:26
  • 2
    `.AsEnumerable()` is not necessary, IGrouping is already an IEnumerable. – Alex Jul 22 '15 at 17:28
  • @Alex If you return part directly the return type would be `IEnumerable<IGrouping<int, T>>` instead of `IEnumerable<IEnumerable<T>>` – Muhammad Hasan Khan Jul 23 '15 at 17:01
  • @HasanKhan I didn't run the code but I tried and it compiles just fine. I think since the return type is explicitly `IEnumerable<IEnumerable<T>>` an implicit cast is taking place when you return part. No? – Alex Jul 23 '15 at 19:09
  • @Alex that's right. When I wrote it, covariance hadn't been introduced in C#. – Muhammad Hasan Khan Jul 23 '15 at 23:29
  • @HasanKhan no harm done. Being explicit in code is usually a good thing. – Alex Jul 24 '15 at 20:29
  • For VB.NET, I had difficulty with query syntax due to lack of i++ in VB (and struggled with the Into keyword), so I used method syntax instead: `Dim i As Integer = 0 Dim splits As IEnumerable(Of IEnumerable(Of T)) = list.GroupBy(Function(item) Dim groupRemainder As Integer = i Mod parts i += 1 Return groupRemainder End Function).Select(Function(group) group.AsEnumerable())` – chinookf Jul 19 '17 at 13:02
  • This code does not do the same thing as .NET 6 Chunk. If you call .NET 6 Chunk with 200 items, and a chunk value of 100, you will get 2 chunks. If you do the same thing with the code above, you'll get 100 chunks, each with 2 items. – Alex Dresko Feb 02 '22 at 15:24
62

EDIT: Okay, it looks like I misread the question. I read it as "pieces of length n" rather than "n pieces". Doh! Considering deleting answer...

(Original answer)

I don't believe there's a built-in way of partitioning, although I intend to write one in my set of additions to LINQ to Objects. Marc Gravell has an implementation here, although I would probably modify it to return a read-only view:

public static IEnumerable<IEnumerable<T>> Partition<T>
    (this IEnumerable<T> source, int size)
{
    T[] array = null;
    int count = 0;
    foreach (T item in source)
    {
        if (array == null)
        {
            array = new T[size];
        }
        array[count] = item;
        count++;
        if (count == size)
        {
            yield return new ReadOnlyCollection<T>(array);
            array = null;
            count = 0;
        }
    }
    if (array != null)
    {             
        Array.Resize(ref array, count);
        yield return new ReadOnlyCollection<T>(array);
    }
}
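A small usage sketch (the harness is mine): each yielded collection is a read-only wrapper over a freshly allocated array, and the last array is resized down when the source doesn't divide evenly.

```csharp
using System;
using System.Collections.Generic;
using System.Collections.ObjectModel;
using System.Linq;

static class PartitionExtensions
{
    // The Partition extension from the answer above.
    public static IEnumerable<IEnumerable<T>> Partition<T>(this IEnumerable<T> source, int size)
    {
        T[] array = null;
        int count = 0;
        foreach (T item in source)
        {
            if (array == null)
                array = new T[size];
            array[count] = item;
            count++;
            if (count == size)
            {
                yield return new ReadOnlyCollection<T>(array);
                array = null;
                count = 0;
            }
        }
        if (array != null)
        {
            // Trailing partial batch: shrink the array before wrapping it.
            Array.Resize(ref array, count);
            yield return new ReadOnlyCollection<T>(array);
        }
    }
}

class Demo
{
    static void Main()
    {
        foreach (var piece in Enumerable.Range(1, 10).Partition(4))
            Console.WriteLine(string.Join(",", piece));
        // Prints:
        // 1,2,3,4
        // 5,6,7,8
        // 9,10
    }
}
```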
Jon Skeet
  • 3
    You *really* don't like those "array[count++]", eh ;-p – Marc Gravell Jan 13 '09 at 07:54
  • Ironically I might well have used it before yesterday. But having written that answer it would have been hypocritical - and looking at both versions I think this *is* slightly easier to read at a glance. At first I used a list instead - Add is even more readable :) – Jon Skeet Jan 13 '09 at 08:03
  • Well, I just added a slightly different answer (to address "n pieces" rather than "pieces of length n"), and mixed in parts of your version ;-p – Marc Gravell Jan 13 '09 at 08:09
  • of course this time around I needed pieces of 8 rather than 8 pieces :-) and this worked great – Simon_Weaver Oct 21 '11 at 08:37
  • 22
    Thank you for not deleting even though it is not an answer for the OP, I wanted the exact same thing - pieces of length n :). – Gishu Dec 06 '12 at 04:57
  • @Jon, I just wonder why you prefer the `ReadOnlyCollection` to a _read-only array_ in this case? – CodeFox Feb 07 '14 at 16:02
  • @Jon: Ah - good point! I have had the fixed length of arrays in mind, but of course - in contrast to the `ReadOnlyCollection` - the elements could be replaced. Thanks for your reply. – CodeFox Feb 10 '14 at 09:37
  • @Gishu - Indeed - it's exactly what i needed too! – Frank Tzanabetis Mar 21 '14 at 02:35
  • The problem with this approach is that it traverses the complete source list (requiring to have the whole list in-memory) before moving on. Does anyone know a better solution for large enumerables? – Dejan Apr 02 '14 at 11:33
  • 3
    @Dejan: No it doesn't. Note the use of `yield return`. It requires one *batch* to be in memory at a time, but that's all. – Jon Skeet Apr 02 '14 at 11:38
  • Yes, I've seen the `yield return` and of course my comment "needs to traverse the complete list" is plainly wrong. Sorry. What happened was this: I was trying to use the method in a Parallel.ForEach in order to partition my data, which exploded in memory usage. I guess that the TPL spawns too many Tasks because the `yield return` only comes *after* filling a partition and so the first phase will be just preparing partitions in parallel. I guess what I need is a proper implementation of http://msdn.microsoft.com/en-us/library/dd381768(VS.100).aspx. – Dejan Apr 03 '14 at 09:25
  • 1
    @Dejan: Right - I wouldn't like to guess about how it interacts with Parallel LINQ partitioning, to be honest :) – Jon Skeet Apr 03 '14 at 09:40
  • Bufferization is a disadvantage if you have big chunks. I think the best solution is here (Jeppe Stig Nielsen): http://stackoverflow.com/questions/13709626/split-an-ienumerablet-into-fixed-sized-chunks-return-an-ienumerableienumerab/13711499#13711499 – SalientBrain Aug 26 '14 at 22:28
  • @SalientBrain: Um, Jeppe Stig Nielsen's solution still uses batching - note the call to `ToList` in `yield return GetNextBatch(enumerator).ToList();`. Batching is *necessary* as otherwise you end up with very odd results if you skip one partition. – Jon Skeet Aug 27 '14 at 05:49
  • @Jon Skeet: Yep, I missed that)) So no magic(. Same implementation in other words. Thanks. – SalientBrain Aug 27 '14 at 12:59
50
static class LinqExtensions
{
    public static IEnumerable<IEnumerable<T>> Split<T>(this IEnumerable<T> list, int parts)
    {
        return list.Select((item, index) => new {index, item})
                   .GroupBy(x => x.index % parts)
                   .Select(x => x.Select(y => y.item));
    }
}
reustmd
  • 39
    I have an irrational dislike of SQL-style Linq, so this is my favorite answer. – piedar Jan 02 '14 at 19:53
  • 1
  • @manu08, I have tried your code. I have a list `var dept = {1,2,3,4,5}`. After splitting, the result is `dept1 = {1,3,5}` and `dept2 = {2,4}` where `parts = 2`. But the result I need is `dept1 = {1,2,3}` and `dept2 = {4,5}` – Karthik Arthik Jan 13 '16 at 06:09
  • 4
    I had the same issue with modulo, so I calculated the column length with `int columnLength = (int)Math.Ceiling((decimal)(list.Count()) / parts);` then did division with `.GroupBy(x => x.index / columnLength)`. One downside is Count() enumerates the list. – goodeye Feb 03 '17 at 00:09
  • 1
    This method (and others like it) relies on the list size being larger than the number of parts it's being split into. – Edward Brey Mar 07 '21 at 12:33
28

Ok, I'll throw my hat in the ring. The advantages of my algorithm:

  1. No expensive multiplication, division, or modulus operators
  2. All operations are O(1) (see note below)
  3. Works for IEnumerable<> source (no Count property needed)
  4. Simple

The code:

public static IEnumerable<IEnumerable<T>>
  Section<T>(this IEnumerable<T> source, int length)
{
  if (length <= 0)
    throw new ArgumentOutOfRangeException("length");

  var section = new List<T>(length);

  foreach (var item in source)
  {
    section.Add(item);

    if (section.Count == length)
    {
      yield return section.AsReadOnly();
      section = new List<T>(length);
    }
  }

  if (section.Count > 0)
    yield return section.AsReadOnly();
}

As pointed out in the comments below, this approach doesn't actually address the original question which asked for a fixed number of sections of approximately equal length. That said, you can still use my approach to solve the original question by calling it this way:

myEnum.Section(myEnum.Count() / number_of_sections + 1)

When used in this manner, the approach is no longer O(1) as the Count() operation is O(N).
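To make the "fixed number of sections" call concrete, here is a small harness (mine, not from the answer): 10 items into 3 sections gives a length of 10 / 3 + 1 = 4, producing sections of 4, 4, and 2. Note the formula is only approximate; for some inputs (e.g. 8 items into 4 sections, giving length 3) it yields fewer sections than requested.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class SectionExtensions
{
    // The Section extension from the answer above.
    public static IEnumerable<IEnumerable<T>> Section<T>(this IEnumerable<T> source, int length)
    {
        if (length <= 0)
            throw new ArgumentOutOfRangeException("length");

        var section = new List<T>(length);
        foreach (var item in source)
        {
            section.Add(item);
            if (section.Count == length)
            {
                yield return section.AsReadOnly();
                section = new List<T>(length);
            }
        }
        if (section.Count > 0)
            yield return section.AsReadOnly();
    }
}

class Demo
{
    static void Main()
    {
        var myEnum = Enumerable.Range(1, 10).ToList();
        int number_of_sections = 3;
        foreach (var s in myEnum.Section(myEnum.Count() / number_of_sections + 1))
            Console.WriteLine(string.Join(",", s));
        // Prints:
        // 1,2,3,4
        // 5,6,7,8
        // 9,10
    }
}
```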

Mike
  • Brilliant - best solution here! A few optimizations: * Clear the linked list instead of creating a new one for each section. A reference to the linked list is never returned to the caller, so it's completely safe. * Don't create the linked list until you reach the first item - that way there's no allocation if the source is empty – ShadowChaser May 06 '11 at 17:24
  • 3
    @ShadowChaser According to MSDN clearing the LinkedList is O(N) complexity so it would ruin my goal of O(1). Of course, you could argue that the foreach is O(N) to begin with... :) – Mike May 09 '11 at 15:31
  • 4
    your answer is right, but the question is wrong for it. Your answer gives unknown number of chunks with fixed size for each chunk. But OP wants a Split functionality where it gives fixed number of chunks with any size per chunk(hopefully of equal or close to equal sizes). Perhaps more suited here http://stackoverflow.com/questions/3773403/linq-partition-list-into-lists-of-8-members – nawfal Dec 06 '12 at 14:35
  • @nawfal - I agree. I approached it this way because it doesn't require knowing the enumerable's length ahead of time, which is more efficient for some applications. If you want a fixed number of chunks then you can call myEnum.Section(myEnum.Count() / number_of_chunks + 1). I will update my answer to reflect this. – Mike Dec 06 '12 at 20:01
  • @Mike, why all the hassle of linkedlist, separate static function etc? Why not a straightforward approach which I believe makes the function even faster? In the [previous thread I posted, here's my answer](http://stackoverflow.com/a/13745058/661933) which is a similar implementation which is faster with no hassles of collection structures. – nawfal Dec 07 '12 at 01:24
  • @nawfal - I wanted an approach that was O(1) or "constant time." The Skip() inside your loop makes your approach O(N) because the time required to process the skip increases linearly with the number of items. Consider an enumerable with 1e12 items being separated into groups of two--the further you get into the enumerable, the slower it will go. I was also trying to avoid multiplication, which is expensive to compute relative to the other operations. – Mike Dec 07 '12 at 15:49
  • 1
    @Mike have you benchmarked it? I hope you know O(1) doesn't mean faster, it only means that the time required for partitioning doesn't scale. I'm just wondering what's your rationale to blindly stick to O(1) when it can be slower than other O(n)'s for all the real life scenarios. I even tested it for a crazy 10^8 strength list and mine appeared to be faster still. I hope you know there are not even standard collection types that can hold 10^12 items.. – nawfal Dec 08 '12 at 10:25
  • Few points: 1) why do you keep insisting that multiplication/modulo operations etc are expensive? 2) `myEnum.Section(myEnum.Count() / number_of_sections + 1)` enumerates the first collection (and for the given question can be inefficient), so pls edit the advantages you have mentioned for your approach. – nawfal Dec 08 '12 at 11:41
  • 3) You may use a lightweight collection structure instead of LinkedList as in Jon Skeet's or Brad's answers, they tend to be faster. As I said O(1) doesn't guarantee speed. The possible overhead with LinkedList outweighs its gains here (they are meant for random removal and insertion). See this [link here](http://stackoverflow.com/questions/169973/when-should-i-use-a-list-vs-a-linkedlist) – nawfal Dec 08 '12 at 11:45
  • 1
    @nawfal - Thank you for your detailed analysis, it helps keep me on my toes. Linked lists in general are known for efficient end-inserts, which is why I selected it here. However I just benchmarked it and indeed the List<> is much faster. I suspect this is some sort of .NET implementation detail, perhaps deserving of a separate StackOverflow question. I have modified my answer to use List<> as per your suggestion. Preallocating the list capacity guarantees that end-insertion is still O(1) and meets my original design goal. I also switched to the built-in .AsReadOnly() in .NET 4.5. – Mike Dec 10 '12 at 15:58
  • @nawfal - At the machine code level, mult/div/mod are generally much more expensive (by an order of magnitude) than simple operations such as set/get/compare. Of course it varies by processor, compiler, and implementation, so you are indeed wise to benchmark your specific case if this sectioning operation is creating a bottleneck. – Mike Dec 10 '12 at 16:09
  • Sorry, this may be a dumb question, but why do we need .AsReadOnly() call here? – ironic Sep 05 '13 at 08:24
  • Technically you don't need it but it's kind of a "best practice" since we're returning an IEnumerable<> which implies that it is immutable. – Mike Sep 05 '13 at 17:09
17

This is the same as the accepted answer, but a much simpler representation:

public static IEnumerable<IEnumerable<T>> Split<T>(this IEnumerable<T> items, 
                                                   int numOfParts)
{
    int i = 0;
    return items.GroupBy(x => i++ % numOfParts);
}

The above method splits an IEnumerable<T> into N chunks of equal or near-equal size.

public static IEnumerable<IEnumerable<T>> Partition<T>(this IEnumerable<T> items, 
                                                       int partitionSize)
{
    int i = 0;
    return items.GroupBy(x => i++ / partitionSize).ToArray();
}

The above method splits an IEnumerable<T> into chunks of a desired fixed size, with the total number of chunks being unimportant - which is not what the question is about.

The problem with the Split method, besides being slower, is that it scrambles the output: the grouping is done on the index modulo N, so items that are N positions apart land in the same chunk. In other words, you don't get the chunks in the original order.

Almost every answer here either doesn't preserve order, or is about partitioning and not splitting, or is plainly wrong. Try this instead; it is faster and preserves order, but is a little more verbose:

public static IEnumerable<IEnumerable<T>> Split<T>(this ICollection<T> items, 
                                                   int numberOfChunks)
{
    if (numberOfChunks <= 0 || numberOfChunks > items.Count)
        throw new ArgumentOutOfRangeException("numberOfChunks");

    int sizePerPacket = items.Count / numberOfChunks;
    int extra = items.Count % numberOfChunks;

    for (int i = 0; i < numberOfChunks - extra; i++)
        yield return items.Skip(i * sizePerPacket).Take(sizePerPacket);

    int alreadyReturnedCount = (numberOfChunks - extra) * sizePerPacket;
    int toReturnCount = extra == 0 ? 0 : sizePerPacket + 1; // the remaining chunks each hold one extra item
    for (int i = 0; i < extra; i++)
        yield return items.Skip(alreadyReturnedCount + i * toReturnCount).Take(toReturnCount);
}

The equivalent method for a Partition operation is here.
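A quick check of the order-preserving behaviour (the harness is mine; the trailing larger chunks hold `sizePerPacket + 1` items each):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class SplitExtensions
{
    // The order-preserving Split from the answer above.
    public static IEnumerable<IEnumerable<T>> Split<T>(this ICollection<T> items, int numberOfChunks)
    {
        if (numberOfChunks <= 0 || numberOfChunks > items.Count)
            throw new ArgumentOutOfRangeException("numberOfChunks");

        int sizePerPacket = items.Count / numberOfChunks;
        int extra = items.Count % numberOfChunks;

        // The first (numberOfChunks - extra) chunks hold sizePerPacket items each...
        for (int i = 0; i < numberOfChunks - extra; i++)
            yield return items.Skip(i * sizePerPacket).Take(sizePerPacket);

        // ...and the remaining chunks hold one extra item each.
        int alreadyReturnedCount = (numberOfChunks - extra) * sizePerPacket;
        for (int i = 0; i < extra; i++)
            yield return items.Skip(alreadyReturnedCount + i * (sizePerPacket + 1)).Take(sizePerPacket + 1);
    }
}

class Demo
{
    static void Main()
    {
        // 7 items into 3 chunks: contiguous runs, in the original order.
        foreach (var chunk in Enumerable.Range(1, 7).ToList().Split(3))
            Console.WriteLine(string.Join(",", chunk));
        // Prints:
        // 1,2
        // 3,4
        // 5,6,7
    }
}
```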

nawfal
  • My preferred answer as it maintains the order of the list, I had to make a modification to evenly distribute the "extra" chunks uniformly throughout the data set though. – Paul Hodgson Nov 12 '21 at 15:05
6

I have been using the Partition function I posted earlier quite often. The only bad thing about it was that it wasn't completely streaming. This is not a problem if you work with few elements in your sequence, but I needed a new solution when I started working with 100,000+ elements.

The following solution is a lot more complex (and more code!), but it is very efficient.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Collections;

namespace LuvDaSun.Linq
{
    public static class EnumerableExtensions
    {
        public static IEnumerable<IEnumerable<T>> Partition<T>(this IEnumerable<T> enumerable, int partitionSize)
        {
            /*
            return enumerable
                .Select((item, index) => new { Item = item, Index = index, })
                .GroupBy(item => item.Index / partitionSize)
                .Select(group => group.Select(item => item.Item)                )
                ;
            */

            return new PartitioningEnumerable<T>(enumerable, partitionSize);
        }

    }


    class PartitioningEnumerable<T> : IEnumerable<IEnumerable<T>>
    {
        IEnumerable<T> _enumerable;
        int _partitionSize;
        public PartitioningEnumerable(IEnumerable<T> enumerable, int partitionSize)
        {
            _enumerable = enumerable;
            _partitionSize = partitionSize;
        }

        public IEnumerator<IEnumerable<T>> GetEnumerator()
        {
            return new PartitioningEnumerator<T>(_enumerable.GetEnumerator(), _partitionSize);
        }

        IEnumerator IEnumerable.GetEnumerator()
        {
            return GetEnumerator();
        }
    }


    class PartitioningEnumerator<T> : IEnumerator<IEnumerable<T>>
    {
        IEnumerator<T> _enumerator;
        int _partitionSize;
        public PartitioningEnumerator(IEnumerator<T> enumerator, int partitionSize)
        {
            _enumerator = enumerator;
            _partitionSize = partitionSize;
        }

        public void Dispose()
        {
            _enumerator.Dispose();
        }

        IEnumerable<T> _current;
        public IEnumerable<T> Current
        {
            get { return _current; }
        }
        object IEnumerator.Current
        {
            get { return _current; }
        }

        public void Reset()
        {
            _current = null;
            _enumerator.Reset();
        }

        public bool MoveNext()
        {
            bool result;

            if (_enumerator.MoveNext())
            {
                _current = new PartitionEnumerable<T>(_enumerator, _partitionSize);
                result = true;
            }
            else
            {
                _current = null;
                result = false;
            }

            return result;
        }

    }



    class PartitionEnumerable<T> : IEnumerable<T>
    {
        IEnumerator<T> _enumerator;
        int _partitionSize;
        public PartitionEnumerable(IEnumerator<T> enumerator, int partitionSize)
        {
            _enumerator = enumerator;
            _partitionSize = partitionSize;
        }

        public IEnumerator<T> GetEnumerator()
        {
            return new PartitionEnumerator<T>(_enumerator, _partitionSize);
        }

        IEnumerator IEnumerable.GetEnumerator()
        {
            return GetEnumerator();
        }
    }


    class PartitionEnumerator<T> : IEnumerator<T>
    {
        IEnumerator<T> _enumerator;
        int _partitionSize;
        int _count;
        public PartitionEnumerator(IEnumerator<T> enumerator, int partitionSize)
        {
            _enumerator = enumerator;
            _partitionSize = partitionSize;
        }

        public void Dispose()
        {
        }

        public T Current
        {
            get { return _enumerator.Current; }
        }
        object IEnumerator.Current
        {
            get { return _enumerator.Current; }
        }
        public void Reset()
        {
            if (_count > 0) throw new InvalidOperationException();
        }

        public bool MoveNext()
        {
            bool result;

            if (_count < _partitionSize)
            {
                if (_count > 0)
                {
                    result = _enumerator.MoveNext();
                }
                else
                {
                    result = true;
                }
                _count++;
            }
            else
            {
                result = false;
            }

            return result;
        }

    }
}

Enjoy!

Elmer
  • This version breaks the contract of IEnumerator. It's not valid to throw InvalidOperationException when Reset is called - I believe many of the LINQ extension methods rely on this behavior. – ShadowChaser May 06 '11 at 17:05
  • 2
    @ShadowChaser I think Reset() should throw a NotSupportedException and everything would be fine. From the MSDN documentation: "The Reset method is provided for COM interoperability. It does not necessarily need to be implemented; instead, the implementer can simply throw a NotSupportedException." – toong May 17 '13 at 12:25
  • @toong Wow, you're right. Not sure how I missed that after all this time. – ShadowChaser May 27 '13 at 15:38
  • It is buggy! I don't remember exactly, but (as far as I remember) it performs an unwanted extra step, which can lead to ugly side effects (with a DataReader, for example). The best solution is here (Jeppe Stig Nielsen): http://stackoverflow.com/questions/13709626/split-an-ienumerablet-into-fixed-sized-chunks-return-an-ienumerableienumerab/13711499#13711499 – SalientBrain Aug 26 '14 at 22:24
4

Interesting thread. To get a streaming version of Split/Partition, one can use enumerators and yield sequences from the enumerator using extension methods. Converting imperative code to functional code using yield is a very powerful technique indeed.

First an enumerator extension that turns a count of elements into a lazy sequence:

public static IEnumerable<T> TakeFromCurrent<T>(this IEnumerator<T> enumerator, int count)
{
    while (count > 0)
    {
        yield return enumerator.Current;
        if (--count > 0 && !enumerator.MoveNext()) yield break;
    }
}

And then an enumerable extension that partitions a sequence:

public static IEnumerable<IEnumerable<T>> Partition<T>(this IEnumerable<T> seq, int partitionSize)
{
    var enumerator = seq.GetEnumerator();

    while (enumerator.MoveNext())
    {
        yield return enumerator.TakeFromCurrent(partitionSize);
    }
}

The end result is a highly efficient, streaming and lazy implementation that relies on very simple code.
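A usage sketch (the harness is mine). As the comments below point out, every inner sequence shares the single underlying enumerator, so each partition must be consumed fully and in order before advancing to the next one; materialize with `ToList()` if you need to hold on to the contents.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class StreamingExtensions
{
    // The two extensions from the answer above.
    public static IEnumerable<T> TakeFromCurrent<T>(this IEnumerator<T> enumerator, int count)
    {
        while (count > 0)
        {
            yield return enumerator.Current;
            if (--count > 0 && !enumerator.MoveNext()) yield break;
        }
    }

    public static IEnumerable<IEnumerable<T>> Partition<T>(this IEnumerable<T> seq, int partitionSize)
    {
        var enumerator = seq.GetEnumerator();
        while (enumerator.MoveNext())
        {
            yield return enumerator.TakeFromCurrent(partitionSize);
        }
    }
}

class Demo
{
    static void Main()
    {
        // Each partition is consumed fully (via ToList) before the outer loop advances.
        foreach (var part in Enumerable.Range(1, 7).Partition(3))
            Console.WriteLine(string.Join(",", part.ToList()));
        // Prints:
        // 1,2,3
        // 4,5,6
        // 7
    }
}
```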

Enjoy!

  • I initially programmed the same thing, but the pattern breaks when Reset is called on one of the nested IEnumerable instances. – ShadowChaser May 06 '11 at 17:13
  • 1
    Does this still work if you only enumerate the partition and not the inner enumerable? since the inner enumerator is deferred then none of the code for the inner (take from current) will execute until it is enumerated therefore movenext() will only be called by the outer partition function, right? If my assumptions are true then this can potentially yield n partitions with n elements in the original enumerable and the inner enumerables will yield unexpected results – Brad Nov 16 '11 at 19:11
  • @Brad it will "fail" as you expect, similar to some of the issues in this thread http://stackoverflow.com/questions/419019/split-list-into-sublists-with-linq (specifically http://stackoverflow.com/a/20953521/1037948) – drzaus Jun 10 '15 at 20:00
3

I use this:

public static IEnumerable<IEnumerable<T>> Partition<T>(this IEnumerable<T> instance, int partitionSize)
{
    return instance
        .Select((value, index) => new { Index = index, Value = value })
        .GroupBy(i => i.Index / partitionSize)
        .Select(i => i.Select(i2 => i2.Value));
}
nawfal
Elmer
  • Please explain why. I have been using this function without any trouble! – Elmer Dec 02 '11 at 02:58
  • read the question again and see if you get n (almost) equal length parts with your function – Muhammad Hasan Khan Dec 02 '11 at 08:13
  • @Elmer your answer is right, but the question is wrong for it. Your answer gives unknown number of chunks with fixed size for each chunk (exactly as Partition, the name you have given for it). But OP wants a Split functionality where it gives fixed number of chunks with any size per chunk(hopefully of equal or close to equal sizes). Perhaps more suited here http://stackoverflow.com/questions/3773403/linq-partition-list-into-lists-of-8-members – nawfal Dec 06 '12 at 14:36
  • I think you can just change i.Index / partitionSize to i.Index % partitionSize and get the requested result. I also prefer this over the accepted answer as it is more compact and readable. – Jake Drew Jul 28 '15 at 16:45
3

As of .NET 6 you can use Enumerable.Chunk<TSource>(IEnumerable<TSource>, Int32).
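For example (sample values mine): note that Chunk is size-based ("pieces of length n", like the comments on the accepted answer point out), so to get n parts you derive the size from the count first.

```csharp
using System;
using System.Linq;

class Demo
{
    static void Main()
    {
        // .NET 6+: contiguous chunks of a fixed size; the last chunk may be shorter.
        foreach (int[] chunk in Enumerable.Range(1, 7).Chunk(3))
            Console.WriteLine(string.Join(",", chunk));
        // Prints:
        // 1,2,3
        // 4,5,6
        // 7

        // To get n parts instead, compute the chunk size with ceiling division.
        var items = Enumerable.Range(1, 10).ToList();
        int n = 3;
        int size = (items.Count + n - 1) / n;
        foreach (int[] part in items.Chunk(size))
            Console.WriteLine(string.Join(",", part));
        // Prints: 1,2,3,4 then 5,6,7,8 then 9,10
    }
}
```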

Mike
2

There are lots of great answers for this question (and its cousins). I needed this myself and created a solution that is designed to be efficient and error-tolerant in a scenario where the source collection can be treated as a list. It does not use any lazy iteration, so it may not be suitable for collections of unknown size that may apply memory pressure.

static public IList<T[]> GetChunks<T>(this IEnumerable<T> source, int batchsize)
{
    IList<T[]> result = null;
    if (source != null && batchsize > 0)
    {
        var list = source as List<T> ?? source.ToList();
        if (list.Count > 0)
        {
            result = new List<T[]>();
            for (var index = 0; index < list.Count; index += batchsize)
            {
                var rangesize = Math.Min(batchsize, list.Count - index);
                result.Add(list.GetRange(index, rangesize).ToArray());
            }
        }
    }
    return result ?? Enumerable.Empty<T[]>().ToList();
}

static public void TestGetChunks()
{
    var ids = Enumerable.Range(1, 163).Select(i => i.ToString());
    foreach (var chunk in ids.GetChunks(20))
    {
        Console.WriteLine("[{0}]", String.Join(",", chunk));
    }
}

I have seen a few answers across this family of questions that use GetRange and Math.Min. But I believe that overall this is a more complete solution in terms of error checking and efficiency.

rhaben
2

This is memory-efficient, defers execution as much as possible (per batch), and operates in linear time O(n).

public static IEnumerable<IEnumerable<T>> InBatchesOf<T>(this IEnumerable<T> items, int batchSize)
{
    List<T> batch = new List<T>(batchSize);
    foreach (var item in items)
    {
        batch.Add(item);

        if (batch.Count >= batchSize)
        {
            yield return batch;
            batch = new List<T>(batchSize);
        }
    }

    if (batch.Count != 0)
    {
        // can't be batch size, or it would've been yielded above
        batch.TrimExcess();
        yield return batch;
    }
}
Brad
1
protected List<List<int>> MySplit(int MaxNumber, int Divider)
{
    if (Divider <= 0 || Divider > MaxNumber)
        throw new ArgumentOutOfRangeException("Divider");

    List<List<int>> lst = new List<List<int>>();
    int ListCount = 0;
    int d = MaxNumber / Divider;
    lst.Add(new List<int>());
    for (int i = 1; i <= MaxNumber; i++)
    {
        lst[ListCount].Add(i);
        // don't start a new list after the last item, or a trailing empty list is produced
        if (i % d == 0 && i != MaxNumber)
        {
            ListCount++;
            d += MaxNumber / Divider;
            lst.Add(new List<int>());
        }
    }
    return lst;
}
Amit Sengar
1

Great answers! For my scenario I tested the accepted answer, and it seems it does not keep order. There is also the great answer by Nawfal that keeps order. But in my scenario I wanted to spread the remainder in a normalized way; all the answers I saw spread the remainder either at the beginning or at the end.

My answer spreads the remainder in a more normalized way.

 static class Program
{          
    static void Main(string[] args)
    {
        var input = new List<String>();
        for (int k = 0; k < 18; ++k)
        {
            input.Add(k.ToString());
        }
        var result = splitListIntoSmallerLists(input, 15);            
        int i = 0;
        foreach(var resul in result){
            Console.WriteLine("------Segment:" + i.ToString() + "--------");
            foreach(var res in resul){
                Console.WriteLine(res);
            }
            i++;
        }
        Console.ReadLine();
    }

    private static List<List<T>> splitListIntoSmallerLists<T>(List<T> i_bigList,int i_numberOfSmallerLists)
    {
        if (i_numberOfSmallerLists <= 0)
            throw new ArgumentOutOfRangeException("Illegal value of numberOfSmallLists");

        int normalizedSpreadRemainderCounter = 0;
        int normalizedSpreadNumber = 0;
            // e.g. 7/5 > 0 ==> output size is 5; 2/5 == 0 ==> output size is 2
        int minimumNumberOfPartsInEachSmallerList = i_bigList.Count / i_numberOfSmallerLists;                        
        int remainder = i_bigList.Count % i_numberOfSmallerLists;
        int outputSize = minimumNumberOfPartsInEachSmallerList > 0 ? i_numberOfSmallerLists : remainder;
        //In case remainder > 0 we want to spread the remainder equally between the others         
        if (remainder > 0)
        {
            if (minimumNumberOfPartsInEachSmallerList > 0)
            {
                normalizedSpreadNumber = (int)Math.Floor((double)i_numberOfSmallerLists / remainder);    
            }
            else
            {
                normalizedSpreadNumber = 1;
            }   
        }
        List<List<T>> retVal = new List<List<T>>(outputSize);
        int inputIndex = 0;            
        for (int i = 0; i < outputSize; ++i)
        {
            retVal.Add(new List<T>());
            if (minimumNumberOfPartsInEachSmallerList > 0)
            {
                retVal[i].AddRange(i_bigList.GetRange(inputIndex, minimumNumberOfPartsInEachSmallerList));
                inputIndex += minimumNumberOfPartsInEachSmallerList;
            }
            //If we have remainder take one from it, if our counter is equal to normalizedSpreadNumber.
            if (remainder > 0)
            {
                if (normalizedSpreadRemainderCounter == normalizedSpreadNumber-1)
                {
                    retVal[i].Add(i_bigList[inputIndex]);
                    remainder--;
                    inputIndex++;
                    normalizedSpreadRemainderCounter=0;
                }
                else
                {
                    normalizedSpreadRemainderCounter++;
                }
            }
        }
        return retVal;
    }      

}
Robocide
  • 6,353
  • 4
  • 37
  • 41
0

This is my way of listing items and breaking them into rows by column count:

  int repeat_count = 4;

  // Assumes a custom ForEach extension that supplies the index;
  // the built-in List<T>.ForEach has no (item, index) overload.
  arrItems.ForEach((x, i) => {
    if (i % repeat_count == 0) 
        row = tbo.NewElement(el_tr, cls_min_height);
    var td = row.NewElement(el_td);
    td.innerHTML = x.Name;
  });
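The same rows-of-N layout can be expressed with a plain index-based `GroupBy`; this is a self-contained sketch independent of the HTML helpers above (the names and row size are illustrative):

```csharp
using System;
using System.Linq;

static class RowsDemo
{
    static void Main()
    {
        var names = new[] { "a", "b", "c", "d", "e", "f", "g" };

        // Integer division of the index chops the sequence into rows of 4.
        var rows = names
            .Select((name, i) => new { name, i })
            .GroupBy(x => x.i / 4, x => x.name)
            .Select(row => row.ToList())
            .ToList();

        foreach (var row in rows)
            Console.WriteLine(string.Join("|", row)); // a|b|c|d then e|f|g
    }
}
```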
IlPADlI
  • 1,943
  • 18
  • 22
0

I was looking for a split like the one for strings, where the whole list is split according to some rule rather than only the first part being taken. This is my solution:

List<int> sequence = new List<int>();
for (int i = 0; i < 2000; i++)
{
     sequence.Add(i);
}
int splitIndex = 900;
List<List<int>> splitted = new List<List<int>>();
while (sequence.Count != 0)
{
    splitted.Add(sequence.Take(splitIndex).ToList() );
    sequence.RemoveRange(0, Math.Min(splitIndex, sequence.Count));
}
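Note that the loop above empties the source list as it goes. A non-destructive variant (a sketch using `List<T>.GetRange`) produces the same chunks while leaving the source intact:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class SplitEveryDemo
{
    // Copies each chunk out of the source without modifying it.
    public static List<List<int>> SplitEvery(List<int> source, int chunkSize)
    {
        var result = new List<List<int>>();
        for (int i = 0; i < source.Count; i += chunkSize)
            result.Add(source.GetRange(i, Math.Min(chunkSize, source.Count - i)));
        return result;
    }

    static void Main()
    {
        var sequence = Enumerable.Range(0, 2000).ToList();
        var chunks = SplitEvery(sequence, 900);

        Console.WriteLine(chunks.Count);    // 3
        Console.WriteLine(chunks[2].Count); // 200 - the last chunk is ragged
        Console.WriteLine(sequence.Count);  // 2000 - the source is untouched
    }
}
```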
Adel
  • 91
  • 4
0

Here is a little tweak for the number of items instead of the number of parts:

public static class MiscExctensions
{
    public static IEnumerable<IEnumerable<T>> Split<T>(this IEnumerable<T> list, int nbItems)
    {
        return (
            list
            .Select((o, n) => new { o, n })
            .GroupBy(g => (int)(g.n / nbItems))
            .Select(g => g.Select(x => x.o))
        );
    }
}
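For instance, splitting 10 items with `nbItems = 3` yields three full chunks and one ragged one; this self-contained sketch inlines the same grouping (the `(int)` cast in the extension is redundant, since integer division already yields an `int`):

```csharp
using System;
using System.Linq;

static class SplitByCountDemo
{
    static void Main()
    {
        // Same index-based grouping as the extension above, with nbItems = 3.
        var chunks = Enumerable.Range(0, 10)
            .Select((o, n) => new { o, n })
            .GroupBy(g => g.n / 3)
            .Select(g => g.Select(x => x.o).ToList())
            .ToList();

        Console.WriteLine(chunks.Count);                // 4
        Console.WriteLine(string.Join(",", chunks[3])); // 9
    }
}
```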
theblindprophet
  • 7,767
  • 5
  • 37
  • 55
J-B
  • 1
0

If the order of elements within the parts is not important, you can try this:

int[] array = new int[] { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
int n = 3;

var result =
   array.Select((value, index) => new { Value = value, Index = index }).GroupBy(i => i.Index % n, i => i.Value);

// or
var result2 =
   from i in array.Select((value, index) => new { Value = value, Index = index })
   group i.Value by i.Index % n into g
   select g;

However, these can't be cast to IEnumerable<IEnumerable<int>> for some reason...
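One workaround (a sketch) is to wrap the query in a generic method: the return type is then `IEnumerable<IEnumerable<T>>` directly, and from C# 4 onward the covariance of `IEnumerable<out T>` makes the conversion from `IEnumerable<IGrouping<int, T>>` implicit:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class SplitRoundRobinDemo
{
    // Round-robin split: element i goes to part i % n, as in the query above.
    public static IEnumerable<IEnumerable<T>> SplitRoundRobin<T>(IEnumerable<T> source, int n)
    {
        return source
            .Select((value, index) => new { value, index })
            .GroupBy(x => x.index % n, x => x.value);
    }

    static void Main()
    {
        int[] array = { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
        var parts = SplitRoundRobin(array, 3).Select(p => p.ToList()).ToList();

        Console.WriteLine(parts.Count);                // 3
        Console.WriteLine(string.Join(",", parts[0])); // 0,3,6,9
    }
}
```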

okutane
  • 13,754
  • 10
  • 59
  • 67
  • It can be done. Instead of direct casting just make the function generic and then call it for your int array – nawfal Dec 07 '12 at 01:19
0

This is my code, nice and short.

 <Extension()> Public Function Chunk(Of T)(ByVal this As List(Of T), ByVal size As Integer) As List(Of List(Of T))
     Dim result As New List(Of List(Of T))
     For i = 0 To CInt(Math.Ceiling(this.Count / size)) - 1
         result.Add(New List(Of T)(this.GetRange(i * size, Math.Min(size, this.Count - (i * size)))))
     Next
     Return result
 End Function
Jonathan Allen
  • 68,373
  • 70
  • 259
  • 447
0

The code below returns the given number of chunks and also keeps the data in its original order:

    static IEnumerable<IEnumerable<T>> SplitSequentially<T>(int chunkParts, List<T> inputList)
    {
        List<int> Splits = split(inputList.Count, chunkParts);

        var skipNumber = 0;
        List<List<T>> list = new List<List<T>>();
        foreach (var count in Splits)
        {
            var internalList = inputList.Skip(skipNumber).Take(count).ToList();
            list.Add(internalList);
            skipNumber += count;
        }
        return list;
    }
    static List<int> split(int x, int n)
    {
        List<int> list = new List<int>();

        if (x % n == 0)
        {
            for (int i = 0; i < n; i++)
                list.Add(x / n);
        }
        else
        {

            // upto n-(x % n) the values 
            // will be x / n 
            // after that the values 
            // will be x / n + 1 
            int zp = n - (x % n);
            int pp = x / n;
            for (int i = 0; i < n; i++)
            {

                if (i >= zp)
                    list.Add((pp + 1));
                else
                    list.Add(pp);
            }
        }
        return list;
    }
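The remainder-spreading rule in `split` can be checked with a compact equivalent (a sketch): the first `n - (x % n)` chunks get `x / n` items and the rest get `x / n + 1`, which also covers the `x % n == 0` case.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class SplitSizesDemo
{
    // Same rule as split() above, written as a single expression.
    public static List<int> ChunkSizes(int x, int n) =>
        Enumerable.Range(0, n)
                  .Select(i => i >= n - x % n ? x / n + 1 : x / n)
                  .ToList();

    static void Main()
    {
        Console.WriteLine(string.Join(",", ChunkSizes(10, 3))); // 3,3,4
        Console.WriteLine(string.Join(",", ChunkSizes(12, 4))); // 3,3,3,3
    }
}
```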
deepak
  • 1
  • 2
-1

Just came across this thread, and most of the solutions here involve adding items to collections, effectively materialising each page before returning it. This is bad for two reasons: first, if your pages are large there's a memory overhead to filling the page; second, some iterators invalidate previous records when you advance to the next one (for example, if you wrap a DataReader in an enumerator method).

This solution uses two nested enumerator methods to avoid any need to cache items into temporary collections. Since the outer and inner iterators are traversing the same enumerable, they necessarily share the same enumerator, so it's important not to advance the outer one until you're done with processing the current page. That said, if you decide not to iterate all the way through the current page, when you move to the next page this solution will iterate forward to the page boundary automatically.

using System.Collections.Generic;

public static class EnumerableExtensions
{
    /// <summary>
    /// Partitions an enumerable into individual pages of a specified size, still scanning the source enumerable just once
    /// </summary>
    /// <typeparam name="T">The element type</typeparam>
    /// <param name="enumerable">The source enumerable</param>
    /// <param name="pageSize">The number of elements to return in each page</param>
    /// <returns></returns>
    public static IEnumerable<IEnumerable<T>> Partition<T>(this IEnumerable<T> enumerable, int pageSize)
    {
        var enumerator = enumerable.GetEnumerator();

        while (enumerator.MoveNext())
        {
            var delivered = new IntByRef { Value = 0 };

            yield return SubPartition(enumerator, pageSize, delivered);

            // Skip past any items of the current page that the caller didn't
            // consume, so the next page starts on a page boundary
            for (int pos = delivered.Value == 0 ? 1 : delivered.Value; pos < pageSize; pos++)
            {
                if (!enumerator.MoveNext())
                {
                    yield break;
                }
            }
        }
    }

    private static IEnumerable<T> SubPartition<T>(IEnumerator<T> enumerator, int pageSize, IntByRef delivered)
    {
        // Count each item out *before* yielding it, and only advance the shared
        // enumerator when another item is still owed to this page. The enumerator
        // is left positioned on the last item handed out, so the outer iterator
        // never skips an element between pages.
        while (true)
        {
            delivered.Value++;
            yield return enumerator.Current;

            if (delivered.Value == pageSize || !enumerator.MoveNext())
            {
                yield break;
            }
        }
    }

    private class IntByRef
    {
        public int Value { get; set; }
    }
}
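If buffering a single page at a time is acceptable (only one page lives in memory at once, and the source is still scanned exactly once), the bookkeeping collapses to a much shorter sketch:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class PagesDemo
{
    // Buffers one page at a time; the source enumerator is advanced exactly once.
    public static IEnumerable<List<T>> Pages<T>(IEnumerable<T> source, int pageSize)
    {
        using (var e = source.GetEnumerator())
        {
            while (e.MoveNext())
            {
                var page = new List<T> { e.Current };
                while (page.Count < pageSize && e.MoveNext())
                    page.Add(e.Current);
                yield return page;
            }
        }
    }

    static void Main()
    {
        foreach (var page in Pages(Enumerable.Range(0, 7), 3))
            Console.WriteLine(string.Join(",", page));
        // 0,1,2
        // 3,4,5
        // 6
    }
}
```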
Jon G
  • 4,083
  • 22
  • 27
  • This is not working at all! The best possible is here http://stackoverflow.com/questions/13709626/split-an-ienumerablet-into-fixed-sized-chunks-return-an-ienumerableienumerab/13711499#13711499! See comments. – SalientBrain Aug 26 '14 at 22:09
-1
int[] items = new int[] { 0,1,2,3,4,5,6,7,8,9, 10 };

int itemIndex = 0;
int groupSize = 2;
int nextGroup = groupSize;

var seqItems = from aItem in items
               group aItem by 
                            (itemIndex++ < nextGroup) 
                            ? 
                            nextGroup / groupSize
                            :
                            (nextGroup += groupSize) / groupSize
                            into itemGroup
               select itemGroup.AsEnumerable();
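Because the query above mutates `itemIndex` and `nextGroup` while the grouping key is evaluated, it only works if enumerated exactly once; materializing it immediately avoids surprises. A self-contained sketch of the same query, run over the same 11 items with a group size of 2:

```csharp
using System;
using System.Linq;

static class StatefulGroupDemo
{
    static void Main()
    {
        int[] items = { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };

        int itemIndex = 0;
        int groupSize = 2;
        int nextGroup = groupSize;

        // Materialize with ToList: re-enumerating the query would reuse the
        // already-mutated counters and produce wrong groups.
        var groups = (from aItem in items
                      group aItem by
                          (itemIndex++ < nextGroup)
                              ? nextGroup / groupSize
                              : (nextGroup += groupSize) / groupSize
                      into itemGroup
                      select itemGroup.ToList()).ToList();

        Console.WriteLine(groups.Count);                    // 6
        Console.WriteLine(string.Join(",", groups.Last())); // 10
    }
}
```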