57

If I have an IEnumerable like:

string[] items = new string[] { "a", "b", "c", "d" };

I would like to loop thru all the pairs of consecutive items (sliding window of size 2). Which would be

("a", "b"), ("b", "c"), ("c", "d")

My solution was is this

    public static IEnumerable<Pair<T, T>> Pairs(IEnumerable<T> enumerable) {
        IEnumerator<T> e = enumerable.GetEnumerator(); e.MoveNext();
        T current = e.Current;
        while ( e.MoveNext() ) {
            T next = e.Current;
            yield return new Pair<T, T>(current, next);
            current = next;
        }
    }
   
 // used like this :
 foreach (Pair<String,String> pair in IterTools<String>.Pairs(items)) {
    System.Out.PrintLine("{0}, {1}", pair.First, pair.Second)
 }

When I wrote this code, I wondered if there are already functions in the .NET framework that do the same thing and do it not just for pairs but for any size tuples. IMHO there should be a nice way to do this kind of sliding window operations.

I use C# 2.0 and I can imagine that with C# 3.0 (using LINQ) there are more and nicer ways to do this, but I'm primarily interested in C# 2.0 solutions. Though, I will also appreciate C# 3.0 solutions.

Theodor Zoulias
  • 34,835
  • 7
  • 69
  • 104
f3lix
  • 29,500
  • 10
  • 66
  • 86
  • This seems like it could share a lot of implementation with Jon Skeet's `SmartEnumerator` which tells you whether an item is the last in the list. http://msmvps.com/blogs/jon_skeet/archive/2007/07/27/smart-enumerations.aspx – Ben Voigt Aug 02 '10 at 18:30
  • For reference, this function is called 'Windowed' in F#: http://stackoverflow.com/questions/8874901/is-there-an-equivalent-to-the-f-seq-windowed-in-c – Benjol Dec 12 '14 at 06:17
  • Somewhat related: [Get previous and next item in a IEnumerable using LINQ](https://stackoverflow.com/questions/8759849/get-previous-and-next-item-in-a-ienumerable-using-linq). – Theodor Zoulias Feb 05 '23 at 15:25

15 Answers15

60

In .NET 4 this becomes even easier:-

var input = new[] { "a", "b", "c", "d", "e", "f" };
var result = input.Zip(input.Skip(1), (a, b) => Tuple.Create(a, b));
Micha Wiedenmann
  • 19,979
  • 21
  • 92
  • 137
Ian Mercer
  • 38,490
  • 8
  • 97
  • 133
  • 24
    It's worth mentioning that this evaluates `input` twice - not a problem for an array, but if it's lazily-evaluated that could be expensive. – dahlbyk Aug 18 '12 at 01:52
  • 8
    Also, the second argument to `Zip` can be passed as a method group: `…input.Zip(input.Skip(1), Tuple.Create);` – Jay Mar 11 '15 at 17:32
  • 4
    I just did this in a unit test, just to see the difference. With `Enumerable.Range(0, count)` as the iterator, I had to increase the count to ~1 million before the delay was noticeable, and ~10 million before it was slow enough to bother me. Still, @dahlbyk's solution elegantly avoids this, so I'd use that any day. (The whole *point* of extension methods is to be able to hide the not-extremely-readable code away from sight, so priorities here should be straightforward...). – Tomas Aschan Nov 08 '16 at 17:13
  • 2
    @Jay, and since .NET Core 3.0 (from 2019) there is an overload that skips that argument entirely, so `var result = input.Zip(input.Skip(1));`. – Jeppe Stig Nielsen Feb 06 '23 at 20:14
46

Rather than require a tuple (pair) type, why not just accept a selector:

public static IEnumerable<TResult> Pairwise<TSource, TResult>(this IEnumerable<TSource> source, Func<TSource, TSource, TResult> resultSelector)
{
    TSource previous = default(TSource);

    using (var it = source.GetEnumerator())
    {
        if (it.MoveNext())
            previous = it.Current;

        while (it.MoveNext())
            yield return resultSelector(previous, previous = it.Current);
    }
}

Which allows you to skip the intermediate object if you want:

string[] items = new string[] { "a", "b", "c", "d" };
var pairs = items.Pairwise((x, y) => string.Format("{0},{1}", x, y));

foreach(var pair in pairs)
    Console.WriteLine(pair);

Or you can use an anonymous type:

var pairs = items.Pairwise((x, y) => new { First = x, Second = y });

Update: I just implemented this on a real project and used C# 7.0 ValueTuple instead:

public static IEnumerable<(T, T)> Pairwise<T>(this IEnumerable<T> source)
{
    var previous = default(T);
    using (var it = source.GetEnumerator())
    {
        if (it.MoveNext())
            previous = it.Current;

        while (it.MoveNext())
            yield return (previous, previous = it.Current);
    }
}
dahlbyk
  • 75,175
  • 8
  • 100
  • 122
  • 3
    I wonder about the execution order in `yield return …(previous, previous = …)`. Does the C# language guarantee that the first argument will be prepared before the second argument is evaluated? – stakx - no longer contributing Jan 27 '13 at 09:28
  • I'm reasonably certain it does, but I have to check the spec to be sure. – dahlbyk Jan 28 '13 at 17:46
  • 4
    Yes it does, see [section 7.4.1](http://msdn.microsoft.com/en-us/library/aa691335%28v=vs.71%29.aspx) of the C# spec. *"During the run-time processing of a function member invocation, the expressions or variable references of an argument list are evaluated in order, from left to right, as follows:..."* – Scott Chamberlain Jul 13 '13 at 21:57
  • 1
    Just wanted to chime in that I've done some performance analysis of this version and using a Queue with Dequeue/Peek and the Zip method. The Queue method is actually twice as fast as the GetEnumerator method and 6 times faster than Zip, and I consider it more readable than both. e.g. var queue = new Queue(enumerable); while(queue.Count() > 1){ yield return func(queue.Dequeue,queue.Peek); } – Novaterata Sep 24 '14 at 13:29
  • Very interesting... can you post your benchmark to a Gist or something? – dahlbyk Sep 24 '14 at 20:35
  • @Novaterata Then your benchmarks are flawed. `GetEnumerator()` is always superior if used correctly. For a concrete `List` you do not want to use its general enumerator, as explained in my [other answer](https://stackoverflow.com/questions/200574/linq-equivalent-of-foreach-for-ienumerablet/62663221#62663221). – l33t Mar 29 '22 at 04:22
12

The easiest way is to use ReactiveExtensions

using System.Reactive;
using System.Reactive.Linq;

and make yourself an extension method to kit bash this together

public static IEnumerable<IList<T>> Buffer<T>(this IEnumerable<T> seq, int bufferSize, int stepSize)
{
    return seq.ToObservable().Buffer(bufferSize, stepSize).ToEnumerable();
}
bradgonesurfing
  • 30,949
  • 17
  • 114
  • 217
  • 8
    Rather than coerce to/from observable, the Interactive Extensions companion to Rx (https://www.nuget.org/packages/ix-main) ships with an `IEnumerable` version of `Buffer()`: https://github.com/Reactive-Extensions/Rx.NET/blob/2252cb4edbb25aca12005b9a912311edd2f095f3/Ix.NET/Source/System.Interactive/EnumerableEx.Single.cs#L211-L229 – dahlbyk May 01 '16 at 20:39
6

Just for convenience, here is a selector-less version of @dahlbyk's answer.

public static IEnumerable<Tuple<T, T>> Pairwise<T>(this IEnumerable<T> enumerable)
{
    var previous = default(T);

    using (var e = enumerable.GetEnumerator())
    {
        if (e.MoveNext())
            previous = e.Current;

        while (e.MoveNext())
            yield return Tuple.Create(previous, previous = e.Current);
    }
}
James Holwell
  • 938
  • 7
  • 10
  • I think this is even cleaner than the original. In modern C# this can be used as: foreach (var (previous, next) in Enumerable.Range(0, 10).PairWise()) Console.WriteLine(previous + "-" + next); – Zar Shardan Nov 08 '19 at 13:44
5

A little late to the party, but as an alternative to all these extension methods, one might use an actual "sliding" Collection to hold (and discard) the data.

Here is one I ended up making today:

public class SlidingWindowCollection<T> : ICollection<T>
{
    private int _windowSize;
    private Queue<T> _source;

    public SlidingWindowCollection(int windowSize)
    {
        _windowSize = windowSize;
        _source = new Queue<T>(windowSize);
    }

    public void Add(T item)
    {
        if (_source.Count == _windowSize)
        {
            _source.Dequeue();
        }
        _source.Enqueue(item);
    }

    public void Clear()
    {
        _source.Clear();
    }

    ...and just keep forwarding all other ICollection<T> methods to _source.
}

Usage:

int pairSize = 2;
var slider = new SlidingWindowCollection<string>(pairSize);
foreach(var item in items)
{
    slider.Add(item);
    Console.WriteLine(string.Join(", ", slider));
}
Sphinxxx
  • 12,484
  • 4
  • 54
  • 84
4

Here is my solution using a Stack. It is short and concise.

string[] items = new string[] { "a", "b", "c", "d" };

Stack<string> stack = new Stack<string>(items.Reverse());

while(stack.Count > 1)
{
  Console.WriteLine("{0},{1}", stack.Pop(), stack.Peek());
}

You can take the same concept and use a queue which avoids the need for reversing the items and is even simpler:

var queue = new Queue<string>(items);

while (queue.Count > 1)
{
   Console.WriteLine("{0},{1}", queue.Dequeue(), queue.Peek());
}

A short word about performance:

I believe it's important to realize that unless you know that a task is causing a bottleneck in your real application, it's probably not worth working out what the truly fastest way of doing is. Instead, write the code which does the job for you. Also, use code you can remember, so it easily flows out of your hand the next time you need it.

Nevertheless, in case you care for some performance data for 10.000.000 random strings:

Run #1
  InputZip             00:00:00.7355567
  PairwiseExtension    00:00:00.5290042
  Stack                00:00:00.6451204
  Queue                00:00:00.3245580
  ForLoop              00:00:00.7808004
  TupleExtension       00:00:03.9661995

Run #2
  InputZip             00:00:00.7386347
  PairwiseExtension    00:00:00.5369850
  Stack                00:00:00.6910079
  Queue                00:00:00.3246276
  ForLoop              00:00:00.8272945
  TupleExtension       00:00:03.9415258

Tested using Jon Skeet's micro benchmarking tool.

If you want to take a look at the source for the test go here: gist here

Postlagerkarte
  • 6,600
  • 5
  • 33
  • 52
  • this is very inefficient, especially if the collection has lots of elements. Your space complexity is O(n) instead of O(1). Your time complexity is also O(n) as other solutions here, but still slower by a constant factor. – Zar Shardan Nov 08 '19 at 13:28
  • This isn't about premature optimization. You are doing more work than necessary with more code. This is just bad design. – Zar Shardan Nov 08 '19 at 17:30
  • Well, some of the better solutions on this page are generics methods that are ready to use and can be copy - pasted into a project with some minor parameter checking. Yours is just a 3 lines idea. And not a good one. You're increasing space complexity from very palatable O(1) to mediocre O(n) and doubling execution time at zero gain in anything. – Zar Shardan Nov 08 '19 at 19:05
  • @ZarShardan: I updated my answer with some benchmark results. – Postlagerkarte Nov 09 '19 at 13:57
  • you're not testing the same thing. in PairwiseExtension you call String.Format which is far more taxing than Tuple.Create. Try replacing it with James Holwell's solution and use ValueTuple instead of Tuple while you at it. Furthermore, often iteration ends early. So maybe add .Take(N/2) before the ToList(). call, and in your solutions Count > N/2 where N is 10000000 in your test. – Zar Shardan Nov 09 '19 at 15:49
  • 3
    Indeed the string.format was influencing the results - i copy / pasted the original solutions - fixed that and changed all types to ValueTuple (good suggestion) also switched the test to use James Holwell's solution. Looking at the results, I don't think it is fair to call any of the given solutions "inefficient" – Postlagerkarte Nov 09 '19 at 16:55
  • 1
    upvoted for the effort to test this. Still don't like your O(n) space solution :D – Zar Shardan Nov 09 '19 at 19:07
2

Something like this:

public static IEnumerable<TResult> Pairwise<T, TResult>(this IEnumerable<T> enumerable, Func<T, T, TResult> selector)
{
    var previous = enumerable.First();
    foreach (var item in enumerable.Skip(1))
    {
        yield return selector(previous, item);
        previous = item;
    }
}
Quiz
  • 21
  • 1
2

Forgive me if I'm overlooking something, but why not something simple, like a for loop?:

public static List <int []> ListOfPairs (int [] items)
{
    List <int> output = new List <int>();
    for (int i=0; i < items.Length-1; i++)
    {
        Int [] pair = new int [2];
        pair [0]=items [i];
        pair [1]=items [i+1];
        output.Add (pair);
    }
    return output;
}
Tim Earley
  • 21
  • 1
2

Expanding on the previous answer to avoid of O(n2) approach by explicitly using the passed iterator:

public static IEnumerable<IEnumerable<T>> Tuples<T>(this IEnumerable<T> input, int groupCount) {
  if (null == input) throw new ArgumentException("input");
  if (groupCount < 1) throw new ArgumentException("groupCount");

  var e = input.GetEnumerator();

  bool done = false;
  while (!done) {
    var l = new List<T>();
    for (var n = 0; n < groupCount; ++n) {
      if (!e.MoveNext()) {
        if (n != 0) {
          yield return l;
        }
        yield break;
      }
      l.Add(e.Current);
    }
    yield return l;
  }
}

For C# 2, before extension methods, drop the "this" from the input parameter and call as a static method.

Community
  • 1
  • 1
Richard
  • 106,783
  • 21
  • 203
  • 265
  • 2
    This doesn't return the result the question asks for. `Enumerable.Range(1, 5).Tuples(2)` returns `{{1, 2}, {3, 4}, {5}}` instead of the desired `{{1, 2}, {2, 3}, {3, 4}, {4, 5}}` that is a sliding window. – Sammy S. Apr 27 '16 at 08:11
1

C# 3.0 solution (sorry:)

public static IEnumerable<IEnumerable<T>> Tuples<T>(this IEnumerable<T> sequence, int nTuple)
{
    if(nTuple <= 0) throw new ArgumentOutOfRangeException("nTuple");

    for(int i = 0; i <= sequence.Count() - nTuple; i++)
        yield return sequence.Skip(i).Take(nTuple);
}

This isn't the most performant in the world, but it's sure pleasant to look at.

Really, the only thing making this a C# 3.0 solution is the .Skip.Take construct, so if you just change that to adding the elements in that range to a list instead, it should be golden for 2.0. That said, it's still not performant.

mqp
  • 70,359
  • 14
  • 95
  • 123
  • 6
    Not the most performant? This is an O(n*n) implementation! For a small 10 item list, the entire list was traversed 20 times – configurator Feb 23 '09 at 14:58
  • 2
    That's true, but it's also two lines of (real) code and is an obviously simple implementation. It's up to the OP to decide whether he needs a fast solution -- maybe he only needs this operation on lists of a couple dozen items. – mqp Feb 23 '09 at 15:04
0

The F# Seq module defines the pairwise function over IEnumerable<T>, but this function is not in the .NET framework.

If it were already in the .NET framework, instead of returning pairs it would probably accept a selector function due to the lack of support for tuples in languages like C# and VB.

var pairs = ns.Pairwise( (a, b) => new { First = a, Second = b };

I don't think any of the answers here really improve on your simple iterator implementation, which seemed the most natural to me (and the poster dahlbyk by the looks of things!) too.

Community
  • 1
  • 1
Pete Montgomery
  • 4,060
  • 3
  • 30
  • 40
0

Alternate Pairs implementation, using last pair to store previous value:

static IEnumerable<Pair<T, T>> Pairs( IEnumerable<T> collection ) {
  Pair<T, T> pair = null;
  foreach( T item in collection ) {
    if( pair == null )
      pair = Pair.Create( default( T ), item );
    else
      yield return pair = Pair.Create( pair.Second, item );
  }
}

Simple Window implementation (only safe for private use, if caller does not save returned arrays; see note):

static IEnumerable<T[]> Window( IEnumerable<T> collection, int windowSize ) {
  if( windowSize < 1 )
    yield break;

  int index = 0;
  T[] window = new T[windowSize];
  foreach( var item in collection ) {
    bool initializing = index < windowSize;

    // Shift initialized window to accomodate new item.
    if( !initializing )
      Array.Copy( window, 1, window, 0, windowSize - 1 );

    // Add current item to window.
    int itemIndex = initializing ? index : windowSize - 1;
    window[itemIndex] = item;

    index++;
    bool initialized = index >= windowSize;
    if( initialized )
      //NOTE: For public API, should return array copy to prevent 
      // modifcation by user, or use a different type for the window.
      yield return window;
  }
}

Example use:

for( int i = 0; i <= items.Length; ++i ) {
  Console.WriteLine( "Window size {0}:", i );
  foreach( string[] window in IterTools<string>.Window( items, i ) )
    Console.WriteLine( string.Join( ", ", window ) );
  Console.WriteLine( );
}
Emperor XLII
  • 13,014
  • 11
  • 65
  • 75
0

I created a slightly modified version of the late-2020-updated code in @dahlbyk's answer. It is better suited for projects with nullable reference types enabled (<Nullable>enable</Nullable>). I also added basic docs.

/// <summary>
/// Enumerates over tuples of pairs of the elements from the original sequence. I.e. { 1, 2, 3 } becomes { (1, 2), (2, 3) }. Note that { 1 } becomes { }.
/// </summary>
public static IEnumerable<(T, T)> Pairwise<T>(this IEnumerable<T> source)
{
    using var it = source.GetEnumerator();
        
    if (!it.MoveNext())
        yield break;

    var previous = it.Current;

    while (it.MoveNext())
        yield return (previous, previous = it.Current);
}
Georg Jung
  • 949
  • 10
  • 27
0

This will make a single pass through the IEnumerable.

items.Aggregate(new List<string[]>(), (list, next) => { if (list.Count > 0) list[^1][1] = next; list.Add(new[] { next, next }); return list; }, (list) => { list.RemoveAt(list.Count -1); return list; });
Thomas Heijtink
  • 427
  • 5
  • 7
  • 1
    The last chuck might have a single element, in which case your code will fail with an `ArgumentOutOfRangeException`. – Theodor Zoulias Feb 04 '23 at 17:50
  • @TheodorZoulias you are correct. Depending on how much control you have over the contents of your `items` you might have to take unintended number of items into account by using the `ElementAtOrDefault` method. When you increase the `Chunk` you will have to use `ElementAtOrDefault` method for all subsequent index after the `0`. – Thomas Heijtink Feb 04 '23 at 19:08
  • 1
    I think that this expression is preferable: `pairs.Length > 1 ? pairs[1] : default`. The `ElementAtOrDefault` will allocate an enumerator for each chuck, and the `Chuck` operator allocates an array per chunk already. – Theodor Zoulias Feb 04 '23 at 19:20
  • Amended the post in cases where one has no control over the contents of the list and performance is somewhat important. – Thomas Heijtink Feb 04 '23 at 19:40
  • Now it's better, but to be honest I am not a fan of casually introducing extension methods for common types, unless these methods can be useful in a wide range of scenarios. Admittedly though your answer is better than some heavily upvoted answers, like the one that double-enumerates the input with `input.Zip(input.Skip(1)`. – Theodor Zoulias Feb 04 '23 at 19:54
  • I guess it all depends what place this has in the code base. Is it a heavily used returning pattern but within the same library, one could at least consider to make it internal. – Thomas Heijtink Feb 04 '23 at 19:59
  • 1
    After reading the question more carefully, your answer does not produce the required output. The OP wants `("a", "b"), ("b", "c"), ("c", "d")`, not `("a", "b"), ("c", "d")`. – Theodor Zoulias Feb 05 '23 at 15:18
  • Yeah you are right. Changed the answer. This was even easier. – Thomas Heijtink Feb 06 '23 at 20:06
  • My understanding is that the OP has defined their problem in terms of a source `IEnumerable`, not an array or `IList`. The fact that they used an array as source in the example is accidental. – Theodor Zoulias Feb 06 '23 at 20:10
  • Haha, well we will get there ;-) – Thomas Heijtink Feb 06 '23 at 20:16
  • 1
    Now ([revision 5](https://stackoverflow.com/revisions/75347040/5)) you are enumerating the source multiple times, like in the flawed Ian Mercer's answer. An enumerable sequence might be deferred, and return different results each time (and different number of results). You can't assume that it is backed by a materialized collection. – Theodor Zoulias Feb 06 '23 at 20:23
  • Last one for today. – Thomas Heijtink Feb 06 '23 at 21:07
  • Upvoted for the effort. :-) – Theodor Zoulias Feb 06 '23 at 21:33
-1

New C# language allows to do something like this:

        var pairlist = new (string, string)[] { ("a", "b"), ("b", "c"), ("c", "d") };

        foreach (var pair in pairlist)
        {
            //do something with pair.Item1 & pair.Item2
TarmoPikaro
  • 4,723
  • 2
  • 50
  • 62