17

I'm trying to track down a bug in our code. I've boiled it down to the snippet below. In the example below I have a grid of ints (a list of rows), but I want to find the indexes of the columns that have a 1. The implementation of this is to create an enumerator for each row and step through each column in turn by keeping the enumerators in step.

class Program
{
    static void Main(string[] args)
    {
        var ints = new List<List<int>> {
            new List<int> {0, 0, 1},    // This row has a 1 at index 2
            new List<int> {0, 1, 0},    // This row has a 1 at index 1
            new List<int> {0, 0, 1}     // This row also has a 1 at index 2
        };
        var result = IndexesWhereThereIsOneInTheColumn(ints);

        Console.WriteLine(string.Join(", ", result)); // Expected: "1, 2"
        Console.ReadKey();
    }


    private static IEnumerable<int> IndexesWhereThereIsOneInTheColumn(
        IEnumerable<List<int>> myIntsGrid)
    {
        var enumerators = myIntsGrid.Select(c => c.GetEnumerator()).ToList();

        short i = 0;
        while (enumerators.All(e => e.MoveNext())) {
            if (enumerators.Any(e => e.Current == 1))
                yield return i;
            i++;

            if (i > 1000)
                throw new Exception("You have gone too far!!!");
        }
    }

}

However I have noticed that MoveNext() is not remembered each time around the while loop. MoveNext() always returns true, and Current is always 0. Is this a purposeful feature of Linq to make it more side effect free?

I noticed that this works:

    private static IEnumerable<int> IndexesWhereThereIsOneInTheColumn(
        IEnumerable<List<int>> myIntsGrid)
    {
        var enumerators = myIntsGrid.Select(c => 
            c.ToArray().GetEnumerator()).ToList(); // added ToArray() 

        short i = 0;
        while (enumerators.All(e => e.MoveNext())) {
            if (enumerators.Any(e => (int)e.Current == 1)) // added cast to int
                yield return i;
            i++;
        }
    }

So is this just a problem with List?

Jonny
  • 2,509
  • 23
  • 42

6 Answers6

17

It is because the enumerator of List<T> is a struct whereas the enumerator of Array is a class.

So when you call Enumerable.All with the struct, copy of enumerator is made and passed as a parameter to Func since structs are copied by value. So e.MoveNext is called on the copy, not the original.

Try this:

Console.WriteLine(new List<int>().GetEnumerator().GetType().IsValueType);
Console.WriteLine(new int[]{}.GetEnumerator().GetType().IsValueType);

It prints:

True
False
Luca Cremonesi
  • 144
  • 2
  • 3
  • 13
Sriram Sakthivel
  • 72,067
  • 7
  • 111
  • 189
  • It's not due to boxing. The OP's code mutates a copy, but not a boxed copy. – CodesInChaos May 08 '14 at 10:33
  • @CodesInChaos Good catch, revised my answer. See if that makes sense :) – Sriram Sakthivel May 08 '14 at 10:40
  • That was my first idea when I read the question, but I couldn't imagine that enumerators would actually be implemented as value type sometimes. Quite error-prone. – Mene May 08 '14 at 11:50
  • 2
    @Mene: Imagine it; it's true! It is error-prone, but remember 99.9999% of the time the enumerator is only seen by the code generated for the `foreach` loop; creating your own unboxed enumerators is a rare scenario. The design team felt that the performance win of avoiding collection pressure was worth the very rare cases like this one where the code is confusing. – Eric Lippert May 14 '14 at 16:41
14

As Sriram Sakthivel's answer says the issue is due to lack of boxing and accidentally the list enumerator implementation being a struct, not a reference type. Usually, one would not expect the value-type behavior for an enumerator, as most are either exposed by the IEnumerator/IEnumerator<T> interfaces, or are reference types themselves. A quick way to go around this is to change this line

var enumerators = myIntsGrid.Select(c => c.GetEnumerator()).ToList();

to

var enumerators 
    = myIntsGrid.Select(c => (IEnumerator) c.GetEnumerator()).ToList();

instead.

The above code will construct a list of already boxed enumerators, which will be treated as reference type instances, because of the interface cast. From that moment on, they should behave as you expect them to in your later code.


If you need a generic enumerator (to avoid casts when latter using the enumerator.Current property), you can cast to the appropriate generic IEnumerator<T> interface:

c => (IEnumerator<int>) c.GetEnumerator()

or even better

c => c.GetEnumerator() as IEnumerator<int>

The as keyword is said to perform a lot better than direct casts, and in the case of a loop it could bring an essential performance benefit. Just be careful that as returns null if the cast fails As per Flater's request from comments:. In the OP's case, it is guaranteed the enumerator implements IEnumerator<int>, so it is safe to go for an as cast.

Community
  • 1
  • 1
Ivaylo Slavov
  • 8,839
  • 12
  • 65
  • 108
  • 2
    Nitpicking: use `(IEnumerator)` instead of `(IEnumerator)` so you don't have to cast `e.Current` inside the `while` loop. – sloth May 08 '14 at 09:45
  • @DominicKexel, good point, I will incorporate it as an edit. Thanks – Ivaylo Slavov May 08 '14 at 09:47
  • 1
    It's not due to boxing. The OP's code mutates a copy, but not a boxed copy. – CodesInChaos May 08 '14 at 10:32
  • @CodesInChaos, correct, the expression __boxing__ is not appropriately used. The problem is actually the _lack of_ boxing that enables the value type copying behavior. I edited this, thanks! – Ivaylo Slavov May 08 '14 at 10:55
  • 1
    @IvayloSlavov: It may be important to note that `foo as Bar` returns null when the cast has failed, whereas `(Bar) foo` throws an exception. They're both useful in their own ways, depends how you want to handle faulty casting :) – Flater May 08 '14 at 14:04
  • @Flater, yes, it is important to know that. I had this in a previous version of the post, but I removed it because I thought I was too verbose. In fact, the question's code uses `List` which is _guaranteed_ to expose a not-null `IEnumerator`, so the `as` keyword is safe to use. Thanks anyway for pointing it out! – Ivaylo Slavov May 09 '14 at 06:34
  • 1
    @IvayloSlavov: Yeah I assumed it was irrelevant in this case. But someone could conclude from your post that `foo as Bar` is always better. It's for those people that I mentioned it :) – Flater May 09 '14 at 06:37
  • @Flater, I added this again as a note for devs to have in mind. – Ivaylo Slavov May 09 '14 at 06:44
2

Alternatively, you could do it with a lambda extension

var ids = Enumerable.Range(0,ints.Max (row => row.Count)).
      Where(col => ints.Any(row => (row.Count>col)? row[col] == (1) : false));

or

var ids = Enumerable.Range(0,ints.Max (row=> row.Count)).
      Where(col => ints.Any (row => row.ElementAtOrDefault(col) == 1));
Tarec
  • 3,268
  • 4
  • 30
  • 47
1

Here's a simple implementation using loops and yield:

private static IEnumerable<int> IndexesWhereThereIsOneInTheColumn(
    IEnumerable<List<int>> myIntsGrid)
{
    for (int i=0; myIntsGrid.Max(l=>l.Count) > i;i++)
    {
        foreach(var row in myIntsGrid)
        {
            if (row.Count > i && row[i] == 1)
            {
                yield return i;
                break;
            }
        }
    }       
}

Alternatively, use this inside the for loop:

if (myIntsGrid.Any(row => row.Count > i && row[i] == 1)) yield return i;
Rik
  • 28,507
  • 14
  • 48
  • 67
1

Just for fun, here's a neat LINQ query that won't cause hard-to-trace side effects in your code:

IEnumerable<int> IndexesWhereThereIsOneInTheColumn(IEnumerable<IEnumerable<int>> myIntsGrid)
{
    return myIntsGrid
        // Collapse the rows into a single row of the maximum value of all rows
        .Aggregate((acc, x) => acc.Zip(x, Math.Max))
        // Enumerate the row
        .Select((Value,Index) => new { Value, Index })
        .Where(x => x.Value == 1)
        .Select(x => x.Index);
}
Alex
  • 7,639
  • 3
  • 45
  • 58
  • Solves the posted (cut down) problem, but I'm not actually using ints in the grid for the real problem. However, I might be able to create an aggregator that bubbles the value I want to the top. – Jonny May 09 '14 at 09:23
  • Sure, you'd just need to change `Math.Max` to a function that picks your chosen value and then obviously the `Where` operator too – Alex May 09 '14 at 09:40
0

Why can't you just get those indexes like this:

var result = ints.Select (i => i.IndexOf(1)).Distinct().OrderBy(i => i);

Seems to be much easier...

st4hoo
  • 2,196
  • 17
  • 25
  • 1
    That will work for this example, but I wanted to reduce the amount of code that I posted. My actual bug is more involved, but the code I posted demonstrates the crux of the problem. – Jonny May 08 '14 at 09:33