8

I have:-

IEnumerable<IEnumerable<T>> items;

and I'd like to create:-

IEnumerable<IEnumerable<T>> results;

where the first item in "results" is an IEnumerable of the first item of each of the IEnumerables of "items", the second item in "results" is an IEnumerable of the second item of each of "items", etc.

The IEnumerables aren't necessarily the same lengths. If some of the IEnumerables in items don't have an element at a particular index, then I'd expect the matching IEnumerable in results to have fewer items in it.

For example:-

items = { "1", "2", "3", "4" } , { "a", "b", "c" };
results = { "1", "a" } , { "2", "b" }, { "3", "c" }, { "4" };

Edit: Another example (requested in comments):-

items = { "1", "2", "3", "4" } , { "a", "b", "c" }, { "p", "q", "r", "s", "t" };
results = { "1", "a", "p" } , { "2", "b", "q" }, { "3", "c", "r" }, { "4", "s" }, { "t" };

I don't know in advance how many sequences there are, nor how many elements are in each sequence. I might have 1,000 sequences with 1,000,000 elements in each, and I might only need the first ~10, so I'd like to use the (lazy) enumeration of the source sequences if I can. In particular I don't want to create a new data structure if I can help it.

Is there a built-in method (similar to IEnumerable.Zip) that can do this?

Is there another way?

Iain Galloway
  • 18,669
  • 6
  • 52
  • 73
  • What happens if items, contains three sequences? – Richard Szalay Oct 21 '10 at 15:51
  • This is similar to ["How to iterate over two arrays at once?"](http://stackoverflow.com/questions/496704/how-to-iterate-over-two-arrays-at-once) (that covers the case of N = 2) – James McNellis Oct 21 '10 at 15:51
  • Check Eric Lippert's blog (which expanded from an answer to a SO question) about computing a cartesian product over arbitrarily many sequences. http://blogs.msdn.com/b/ericlippert/archive/2010/06/28/computing-a-cartesian-product-with-linq.aspx – Anthony Pegram Oct 21 '10 at 15:53
  • Apart from the requirement on different lengths, this sound a lot like the transpose of a matrix. I wrote a Transpose method for LINQ here: http://stackoverflow.com/questions/2070356/find-common-prefix-of-strings/2070434#2070434 – dtb Oct 21 '10 at 15:57
  • @Richard: Added an example, @James: Yes, but it's easy if you know in advance how many sequences there are :(, @Anthony: Alas, Eric can use an accumulator and so doesn't *need* the pivot, @dtb: Your transpose method looks like a good start. – Iain Galloway Oct 21 '10 at 16:03
  • You have good answers already and I'm not at a computer to test, but you should be able to call `MakeGenericMethod` on `Zip` and get the required method at runtime. – Mark Hurd Jan 03 '14 at 07:20

6 Answers6

7

Now lightly tested and with working disposal.

public static class Extensions
{
  public static IEnumerable<IEnumerable<T>> JaggedPivot<T>(
    this IEnumerable<IEnumerable<T>> source)
  {
    List<IEnumerator<T>> originalEnumerators = source
      .Select(x => x.GetEnumerator())
      .ToList();

    try
    {
      List<IEnumerator<T>> enumerators = originalEnumerators
        .Where(x => x.MoveNext()).ToList();

      while (enumerators.Any())
      {
        List<T> result = enumerators.Select(x => x.Current).ToList();
        yield return result;
        enumerators = enumerators.Where(x => x.MoveNext()).ToList();
      }
    }
    finally
    {
      originalEnumerators.ForEach(x => x.Dispose());
    }
  } 
}

public class TestExtensions
{
  public void Test1()
  {
    IEnumerable<IEnumerable<int>> myInts = new List<IEnumerable<int>>()
    {
      Enumerable.Range(1, 20).ToList(),
      Enumerable.Range(21, 5).ToList(),
      Enumerable.Range(26, 15).ToList()
    };

    foreach(IEnumerable<int> x in myInts.JaggedPivot().Take(10))
    {
      foreach(int i in x)
      {
        Console.Write("{0} ", i);
      }
      Console.WriteLine();
    }
  }
}
Amy B
  • 108,202
  • 21
  • 135
  • 185
  • 3
    Note that you're not disposing of any of the iterators, which could be bad depending on what they're iterating over. – Jon Skeet Oct 21 '10 at 16:02
  • Innnnnnteresting. I didn't know finally would be activated in these iterator methods when there's laziness happening. Makes sense though, as the caller will eventually call Dispose on the lazy IEnumerator. – Amy B Oct 21 '10 at 18:54
  • I've accepted this answer, but make sure you read Jon Skeet's answer as well. – Iain Galloway Oct 25 '10 at 08:20
4

It's reasonably straightforward to do if you can guarantee how the results are going to be used. However, if the results might be used in an arbitrary order, you may need to buffer everything. Consider this:

var results = MethodToBeImplemented(sequences);
var iterator = results.GetEnumerator();
iterator.MoveNext();
var first = iterator.Current;
iterator.MoveNext();
var second = iterator.Current;
foreach (var x in second)
{
    // Do something
}
foreach (var x in first)
{
    // Do something
}

In order to get at the items in "second" you'll have to iterate over all of the subsequences, past the first items. If you then want it to be valid to iterate over the items in first you either need to remember the items or be prepared to re-evaluate the subsequences.

Likewise you'll either need to buffer the subsequences as IEnumerable<T> values or reread the whole lot each time.

Basically it's a whole can of worms which is difficult to do elegantly in a way which will work pleasantly for all situations :( If you have a specific situation in mind with appropriate constraints, we may be able to help more.

Jon Skeet
  • 1,421,763
  • 867
  • 9,128
  • 9,194
  • Thanks! Thankfully I can guarantee that the results are going to be used in order. Does that alleviate the need for the calls to "ToList" in DavidB's answer? (Going to test now) – Iain Galloway Oct 21 '10 at 16:07
  • @Iain: I suspect so, although it may be quite complex to get it right. You'll still need to buffer the subsequence references themselves as `IEnumerable`. Hmmm. Instead of returning an `IEnumerable>`, could you pass in an action to be executed on each value, or something like that? It would make things a lot simpler! – Jon Skeet Oct 21 '10 at 16:13
  • I don't think I can pass in an Action (good idea though, I'll keep it in mind for similar problems in the future!). At this stage, I'm thinking that if making it elegant will make it super-complicated, I'll just go ahead and inline it all. – Iain Galloway Oct 21 '10 at 16:20
1

Based on David B's answer, this code should perform better:

public static IEnumerable<IEnumerable<T>> JaggedPivot<T>(
    this IEnumerable<IEnumerable<T>> source)
{
    var originalEnumerators = source.Select(x => x.GetEnumerator()).ToList();
    try
    {
        var enumerators =
            new List<IEnumerator<T>>(originalEnumerators.Where(x => x.MoveNext()));

        while (enumerators.Any())
        {
            yield return enumerators.Select(x => x.Current).ToList();
            enumerators.RemoveAll(x => !x.MoveNext());
        }
    }
    finally
    {
        originalEnumerators.ForEach(x => x.Dispose());
    }
}

The difference is that the enumerators variable isn't re-created all the time.

Community
  • 1
  • 1
tm1
  • 1,180
  • 12
  • 28
0

You could compose existing operators like this,

IEnumerable<IEnumerable<int>> myInts = new List<IEnumerable<int>>()
    {
        Enumerable.Range(1, 20).ToList(),
        Enumerable.Range(21, 5).ToList(),
        Enumerable.Range(26, 15).ToList()
    };

myInts.SelectMany(item => item.Select((number, index) => Tuple.Create(index, number)))
      .GroupBy(item => item.Item1)
      .Select(group => group.Select(tuple => tuple.Item2));
Pragmateek
  • 13,174
  • 9
  • 74
  • 108
Arun Aravind
  • 1,238
  • 10
  • 9
0

Here's one that is a bit shorter, but no doubt less efficient:

Enumerable.Range(0,items.Select(x => x.Count()).Max())
    .Select(x => items.SelectMany(y => y.Skip(x).Take(1)));
diceguyd30
  • 2,742
  • 20
  • 18
0

What about this?

        List<string[]> items = new List<string[]>()
        {
            new string[] { "a", "b", "c" },
            new string[] { "1", "2", "3" },
            new string[] { "x", "y" },
            new string[] { "y", "z", "w" }
        };

        var x = from i in Enumerable.Range(0, items.Max(a => a.Length))
                select from z in items
                       where z.Length > i
                       select z[i];
as-cii
  • 12,819
  • 4
  • 41
  • 43
  • If I had the data in memory already in a form that I could do random-access on it, it would be this easy. Unfortunately I need the lazy enumeration of the source IEnumerables. Calling z[i] - or z.ElementAt(i) - undermines that. – Iain Galloway Oct 22 '10 at 13:42