20

I've got a work around for this issue, but I'm trying to figure out why it works . Basically, I'm looping through a list of structs using foreach. If I include a LINQ statement that references the current struct before I call a method of the struct, the method is unable to modify the members of the struct. This happens regardless of whether the LINQ statement is even called. I was able to work around this by assigning the value I was looking for to a variable and using that in the LINQ, but I would like to know what is causing this. Here's an example I created.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

namespace WeirdnessExample
{
    public struct RawData
    {
        private int id;

        public int ID
        {
            get{ return id;}
            set { id = value; }
        }

        public void AssignID(int newID)
        {
            id = newID;
        }
    }

    public class ProcessedData
    {
        public int ID { get; set; }
    }

    class Program
    {
        static void Main(string[] args)
        {
            List<ProcessedData> processedRecords = new List<ProcessedData>();
            processedRecords.Add(new ProcessedData()
            {
                ID = 1
            });


            List<RawData> rawRecords = new List<RawData>();
            rawRecords.Add(new RawData()
            {
                ID = 2
            });


            int i = 0;
            foreach (RawData rawRec in rawRecords)
            {
                int id = rawRec.ID;
                if (i < 0 || i > 20)
                {
                    List<ProcessedData> matchingRecs = processedRecords.FindAll(mr => mr.ID == rawRec.ID);
                }

                Console.Write(String.Format("With LINQ: ID Before Assignment = {0}, ", rawRec.ID)); //2
                rawRec.AssignID(id + 8);
                Console.WriteLine(String.Format("ID After Assignment = {0}", rawRec.ID)); //2
                i++;
            }

            rawRecords = new List<RawData>();
            rawRecords.Add(new RawData()
            {
                ID = 2
            });

            i = 0;
            foreach (RawData rawRec in rawRecords)
            {
                int id = rawRec.ID;
                if (i < 0)
                {
                    List<ProcessedData> matchingRecs = processedRecords.FindAll(mr => mr.ID == id);
                }
                Console.Write(String.Format("With LINQ: ID Before Assignment = {0}, ", rawRec.ID)); //2
                rawRec.AssignID(id + 8);
                Console.WriteLine(String.Format("ID After Assignment = {0}", rawRec.ID)); //10
                i++;
            }

            Console.ReadLine();
        }
    }
}
Firo
  • 30,626
  • 4
  • 55
  • 94
user1097991
  • 251
  • 2
  • 5
  • 11
    It's not clear what your example is trying to show. I strongly suspect this is just another example of why mutable structs are a really bad idea. – Jon Skeet Nov 28 '12 at 17:09
  • I think I'm starting to understand it. I'm trying to reproduce this in a much simpler form. – Jon Skeet Nov 28 '12 at 17:16
  • 3
    Note that nothing in your code uses LINQ at all. You're using List.FindAll, which was present in .NET 2.0, and a lambda expression - which are often used *with* LINQ, but aren't really part of LINQ. This is part of what threw me off to start with - I was trying to find anything LINQ-related... – Jon Skeet Nov 28 '12 at 17:48

5 Answers5

39

Okay, I've managed to reproduce this with a rather simpler test program, as shown below, and I now understand it. Admittedly understanding it doesn't make me feel any less nauseous, but hey... Explanation after code.

using System;
using System.Collections.Generic;

struct MutableStruct
{
    public int Value { get; set; }

    public void AssignValue(int newValue)
    {
        Value = newValue;
    }
}

class Test
{
    static void Main()
    {
        var list = new List<MutableStruct>()
        {
            new MutableStruct { Value = 10 }
        };

        Console.WriteLine("Without loop variable capture");
        foreach (MutableStruct item in list)
        {
            Console.WriteLine("Before: {0}", item.Value); // 10
            item.AssignValue(30);
            Console.WriteLine("After: {0}", item.Value);  // 30
        }
        // Reset...
        list[0] = new MutableStruct { Value = 10 };

        Console.WriteLine("With loop variable capture");
        foreach (MutableStruct item in list)
        {
            Action capture = () => Console.WriteLine(item.Value);
            Console.WriteLine("Before: {0}", item.Value);  // 10
            item.AssignValue(30);
            Console.WriteLine("After: {0}", item.Value);   // Still 10!
        }
    }
}

The difference between the two loops is that in the second one, the loop variable is captured by a lambda expression. The second loop is effectively turned into something like this:

// Nested class, would actually have an unspeakable name
class CaptureHelper
{
    public MutableStruct item;

    public void Execute()
    {
        Console.WriteLine(item.Value);
    }
}

...
// Second loop in main method
foreach (MutableStruct item in list)
{
    CaptureHelper helper = new CaptureHelper();
    helper.item = item;
    Action capture = helper.Execute;

    MutableStruct tmp = helper.item;
    Console.WriteLine("Before: {0}", tmp.Value);

    tmp = helper.item;
    tmp.AssignValue(30);

    tmp = helper.item;
    Console.WriteLine("After: {0}", tmp.Value);
}

Now of course each time we copy the variable out of helper we get a fresh copy of the struct. This should normally be fine - the iteration variable is read-only, so we'd expect it not to change. However, you have a method which changes the contents of the struct, causing the unexpected behaviour.

Note that if you tried to change the property, you'd get a compile-time error:

Test.cs(37,13): error CS1654: Cannot modify members of 'item' because it is a
    'foreach iteration variable'

Lessons:

  • Mutable structs are evil
  • Structs which are mutated by methods are doubly evil
  • Mutating a struct via a method call on an iteration variable which has been captured is triply evil to the extent of breakage

It's not 100% clear to me whether the C# compiler is behaving as per the spec here. I suspect it is. Even if it's not, I wouldn't want to suggest the team should put any effort into fixing it. Code like this is just begging to be broken in subtle ways.

Jon Skeet
  • 1,421,763
  • 867
  • 9,128
  • 9,194
  • 3
    And if we add interfaces to the mix, nasal demons! – leppie Nov 28 '12 at 17:44
  • 4
    @leppie: The trouble is, part of me immediately starts wondering how I can abuse this to best effect... – Jon Skeet Nov 28 '12 at 17:47
  • Thanks. That makes sense. It took me a while to realize that the programmer that did this was modifying a mutable struct. When I added in the LINQ it just stopped working. Really confusing. – user1097991 Nov 28 '12 at 20:07
  • @user1097991: As per my comment on the question, there's no LINQ involved here. There's a lambda expression, but that's not LINQ. – Jon Skeet Nov 28 '12 at 20:38
  • @JonSkeet: I would say that issue not in lambda expression per se but in different treatment of the loop variable based on loop body (actually in treatment of what read-only local variable means). – Sergey Teplyakov Nov 29 '12 at 11:15
  • @SergeyTeplyakov: Well, based on whether the loop body contains an anonymous function (lambda expression or anonymous method) which captures the iteration variable. – Jon Skeet Nov 29 '12 at 11:41
  • If one changes changes the signature of `AssignId` to `public static void AssignID(ref RawData it, int newID)`, what happens? If `item` is supposed to be read-only, I would expect that using the alternate function (passing `item` by `ref`) would cause the compiler to correctly squawk in both cases. – supercat Nov 29 '12 at 17:52
  • @supercat: Yes, that fails to compile. – Jon Skeet Nov 29 '12 at 17:54
  • @JonSkeet: I wish there was a way to indicate whether or not particular methods or properties should be usable on struct instances that cannot legally be passed as `ref`, since 99% of "mutable struct problems" occur with code which would cause a compiler squawk if such feature existed, and because there are times when it would be useful to allow property setters to be invoked on certain "read-only" structures [e.g. it would be useful for `ArraySegment` to allow read-write indexed property access to the array, but many languages won't allow any property setters to be used on read-only structs]. – supercat Nov 29 '12 at 18:05
  • This is clearly a bug. If it's in the spec, then it's a spec bug, but still a bug. – luiscubal Dec 08 '12 at 16:54
  • 4
    @luiscubal: It would at least be a design flaw. In fact it *is* a compiler bug (confirmed). – Jon Skeet Dec 09 '12 at 09:55
  • Fascinating! . It made a good episode on TekPub TV, too. Just for fun, I rewrote Jon's example as an MSpec test. http://www.tigranetworks.co.uk/blogs/electricdreams/the-evil-of-mutable-structs/ – Tim Long Dec 24 '12 at 12:07
4

Ok. We definitely have an issues here but I suspect that this issue not with closures per se but with foreach implementation instead.

C# 4.0 specification stated (8.8.4 The foreach statement) that "the iteration variable corresponds to a read-only local variable with a scope that extends over the embedded statement". That's why we can't change loop variable or increment it's property (as Jon already stated):

struct Mutable
{
    public int X {get; set;}
    public void ChangeX(int x) { X = x; }
}

var mutables = new List<Mutable>{new Mutable{ X = 1 }};
foreach(var item in mutables)
{
  // Illegal!
  item = new Mutable(); 

  // Illegal as well!
  item.X++;
}

In this regard read-only loop variables behave almost exactly the same as any readonly field (in terms of accessing this variable outside of the constructor):

  • We can't change readonly field outside of the constructor
  • We can't change property of the read-only field of value type
  • We're treating readonly fields as values that leads to using a temporary copy every time we accessing readonly field of value type.

.

class MutableReadonly
{
  public readonly Mutable M = new Mutable {X = 1};
}

// Somewhere in the code
var mr = new MutableReadonly();

// Illegal!
mr.M = new Mutable();

// Illegal as well!
mr.M.X++;

// Legal but lead to undesired behavior
// becaues mr.M.X remains unchanged!
mr.M.ChangeX(10);

There is a plenty of issues related to mutable value types and one of them related to the last behavior: changing readonly struct via mutator method (like ChangeX) lead to obscure behavior because we'll modify a copy but not an readonly object itself:

mr.M.ChangeX(10);

Is equivalent to:

var tmp = mr.M;
tmp.ChangeX(10);

If loop variable treated by the C# compiler as a read-only local variable, than its seems reasonable to expect the same behavior for them as for read-only fields.

Right now loop variable in the simple loop (without any closures) behaves almost the same as a read-only field except copying it for every access. But if code changes and closure comes to play, loop variable starts behaving like pure read-only variable:

var mutables = new List<Mutable> { new Mutable { X = 1 } };

foreach (var m in mutables)
{
    Console.WriteLine("Before change: {0}", m.X); // X = 1

    // We'll change loop variable directly without temporary variable
    m.ChangeX(10);

    Console.WriteLine("After change: {0}", m.X); // X = 10
}

foreach (var m in mutables)
{
    // We start treating m as a pure read-only variable!
    Action a = () => Console.WriteLine(m.X));

    Console.WriteLine("Before change: {0}", m.X); // X = 1

    // We'll change a COPY instead of a m variable!
    m.ChangeX(10);

    Console.WriteLine("After change: {0}", m.X); // X = 1
}

Unfortunately I can't find strict rules how read-only local variables should behave but its clear that this behavior is different based on loop body: we're not copying to locals for every access in simple loop, but we DO this if the loop body closes over loop variable.

We all know that Closing over loop variable considered harmful and that loop implementation was changed in the C# 5.0. Simple way to fix that old issue in pre C# 5.0 era was introducing local variable, but interesting that introducing local variable in this our case will change behavior as well:

foreach (var mLoop in mutables)
{
    // Introducing local variable!
    var m = mLoop;

    // We're capturing local variable instead of loop variable
    Action a = () => Console.WriteLine(m.X));

    Console.WriteLine("Before change: {0}", m.X); // X = 1

    // We'll roll back this behavior and will change
    // value type directly in the closure without making a copy!
    m.ChangeX(10); // X = 10 !!

    Console.WriteLine("After change: {0}", m.X); // X = 1
}

Actually this means that C# 5.0 has very subtle breaking change because no one will introduce a local variable any more (and even tools like ReSharper stops warning about it in VS2012 because its not an issue).

I'm OK with both behaviors but inconsistency seems strange.

Sergey Teplyakov
  • 11,477
  • 34
  • 49
  • I wonder if there would be any difficulty with Microsoft adding a attribute that struct methods and properties could use to indicate whether they modify the underlying structure, so that compilers could forbid the calling of `this`-mutating method and properties on read-only structures but permit it on properties (including setters) that do not modify `this` [an example of the latter would be a version of `ArraySegment` that allowed read-write indexed access to the underlying array]. – supercat Nov 29 '12 at 17:55
  • Does the change in C# 5.0 changes the behaviour described in the original question? – Andrew Savinykh Nov 29 '12 at 22:16
  • No, it's not. But it can hide it even deeper, because in c# 5 no one will introduce local variable for capturing loop variable. – Sergey Teplyakov Nov 29 '12 at 22:18
1

I suspect this has to do with how lambda expressions are evaluated. See this question and its answer for more details.

Question:

When using lambda expressions or anonymous methods in C#, we have to be wary of the access to modified closure pitfall. For example:

foreach (var s in strings)
{
   query = query.Where(i => i.Prop == s); // access to modified closure

Due to the modified closure, the above code will cause all of the Where clauses on the query to be based on the final value of s.

Answer:

This is one of the worst "gotchas" in C#, and we are going to take the breaking change to fix it. In C# 5 the foreach loop variable will be logically inside the body of the loop, and therefore closures will get a fresh copy every time.

Community
  • 1
  • 1
Bobson
  • 13,498
  • 5
  • 55
  • 80
  • 1
    I think it *is* to do with the lambda expression, but not in quite the way that you're expecting. It's *not* fixed in C# 5 though - I'm using the C# 5 compiler now, and still seeing the same behaviour. – Jon Skeet Nov 28 '12 at 17:14
  • @JonSkeet - That wouldn't surprise me. Nothing I've seen on closure before has specifically addressed structs, so I took an educated guess. I didn't actually test it at all. – Bobson Nov 28 '12 at 17:18
1

Just to accomplish Sergey's post, I wanna to add following example with manual closure, that demonstrates compiler's behavior. Of course compiler might have any other implementation that satisfies readonly requirement of captured within foreach statement variable.

static void Main()
{
    var list = new List<MutableStruct>()
    {
        new MutableStruct { Value = 10 }
    };

    foreach (MutableStruct item in list)
    {
       var c = new Closure(item);

       Console.WriteLine(c.Item.Value);
       Console.WriteLine("Before: {0}", c.Item.Value);  // 10
       c.Item.AssignValue(30);
       Console.WriteLine("After: {0}", c.Item.Value);   // Still 10!
    }
}

class Closure
{
    public Closure(MutableStruct item){
    Item = item;
}
    //readonly modifier is mandatory
    public readonly MutableStruct Item;
    public void Foo()
    {
        Console.WriteLine(Item.Value);
    }
}  
0

This might solve your issue. It swaps out foreach for a for and makes the struct immutable.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

namespace WeirdnessExample
{
    public struct RawData
    {
        private readonly int id;

        public int ID
        {
            get{ return id;}
        }

        public RawData(int newID)
        {
            id = newID;
        }
    }

    public class ProcessedData
    {
        private readonly int id;

        public int ID
        {
            get{ return id;}
        }

        public ProcessedData(int newID)
        {
            id = newID;
        }
    }

    class Program
    {
        static void Main(string[] args)
        {
            List<ProcessedData> processedRecords = new List<ProcessedData>();
            processedRecords.Add(new ProcessedData(1));


            List<RawData> rawRecords = new List<RawData>();
            rawRecords.Add(new RawData(2));


            for (int i = 0; i < rawRecords.Count; i++)
            {
                RawData rawRec = rawRecords[i];
                int id = rawRec.ID;
                if (i < 0 || i > 20)
                {
                    RawData rawRec2 = rawRec;
                    List<ProcessedData> matchingRecs = processedRecords.FindAll(mr => mr.ID == rawRec2.ID);
                }

                Console.Write(String.Format("With LINQ: ID Before Assignment = {0}, ", rawRec.ID)); //2
                rawRec = new RawData(rawRec.ID + 8);
                Console.WriteLine(String.Format("ID After Assignment = {0}", rawRec.ID)); //2
                i++;
            }

            rawRecords = new List<RawData>();
            rawRecords.Add(new RawData(2));

            for (int i = 0; i < rawRecords.Count; i++)
            {
                RawData rawRec = rawRecords[i];
                int id = rawRec.ID;
                if (i < 0)
                {
                    List<ProcessedData> matchingRecs = processedRecords.FindAll(mr => mr.ID == id);
                }
                Console.Write(String.Format("With LINQ: ID Before Assignment = {0}, ", rawRec.ID)); //2
                rawRec = new RawData(rawRec.ID + 8);
                Console.WriteLine(String.Format("ID After Assignment = {0}", rawRec.ID)); //10
                i++;
            }

            Console.ReadLine();
        }
    }
}
Jesse C. Slicer
  • 19,901
  • 3
  • 68
  • 87