1

Follow up from:

How do I iterate through a dynamically valued dictionary in parallel?

If I run a function using a locally scoped variable, my program eventually just throws away the value - which has left me utterly confused.

Why is this happening?

Simple Example:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using System.Data;

namespace ConsoleApplication
{
    class Program
    {
        static void Main(string[] args)
        {
            Dictionary<string, object> Dict = new Dictionary<string, object>() {
                { "1",  new Dictionary<string, object>() { 
                    {"Columns",new object[1]{ "name"} }
                }},
                 { "2",  new Dictionary<string, object>() { 
                    {"Columns",new object[1]{ "name"} }
                }}

            };
            int i = 0;
            foreach (KeyValuePair<string, object> record in Dict)
            {

                var _recordStored = record; //THIS IS(was) SUPPOSED TO FIX MY WOES!!!
                new System.Threading.Timer(_ => {
                    i++;
                    //if i comment the next line, the code "works" ( albeit doing nothing really" )
                    processFile((Dictionary<string, object>)_recordStored.Value, new List<object>{});
                    Console.WriteLine("Val: " + _recordStored.Key + " ThreadID: " + System.Threading.Thread.CurrentThread.ManagedThreadId+ " "+ i);
                }, null, TimeSpan.Zero, new TimeSpan(0, 0,0,0,1));
            }

            for (; ; )
            {
                System.Threading.Thread.Sleep(10000);
            }
        }

        public static void processFile(Dictionary<string, dynamic> record, List<object> fileData)
        {

            DataTable dt = new DataTable("table");
            foreach (string column in record["Columns"])
            {
                dt.Columns.Add(column, typeof(string));
            }
        }
    }
}

Which has the output:

Val: 1 ThreadID: 12
Val: 2 ThreadID: 11
Val: 1 ThreadID: 11
Val: 2 ThreadID: 12
Val: 2 ThreadID: 12
Val: 1 ThreadID: 11
Val: 2 ThreadID: 12

But will eventually ( after ~880) iterations just start printing

Val: 2 ThreadID: 10
Val: 2 ThreadID: 12
Val: 2 ThreadID: 10
Val: 2 ThreadID: 12
Val: 2 ThreadID: 10
Val: 2 ThreadID: 12

The wierdest thing about all this is, that when I remove the call to processFile the code will always execute perfectly.

Community
  • 1
  • 1
Alex
  • 5,674
  • 7
  • 42
  • 65
  • Your code doesn't compile... "The name 'i' does not exist in the current context" – Jon Skeet Jul 15 '13 at 19:38
  • @JonSkeet Oops. Fixed the code not compiling. – Alex Jul 15 '13 at 19:41
  • Now the code stops after a while for me, when the timers are collected. You're not keeping them from being garbage collected. After fixing that, I'm not seeing the problem you are... – Jon Skeet Jul 15 '13 at 19:42
  • You might want to use a conncurentDictionary for this. – webdad3 Jul 15 '13 at 19:42
  • What u have fixed Alex.Update your question with latest one – Kamran Shahid Jul 15 '13 at 19:45
  • 1
    Agree with Jon... From MSN :) As long as you are using a Timer, you must keep a reference to it. As with any managed object, a Timer is subject to garbage collection when there are no references to it. The fact that a Timer is still active does not prevent it from being collected. – Bogdan_Ch Jul 15 '13 at 19:46

1 Answers1

6

This is garbage collection at work, correctly doing its job mind you.

The problem is related to your timers.

Before reading further, note that your trick to use a locally scoped variable was entirely correct. This was just not the problem you were having. So continue doing that, if you hadn't done that you would have had other/more problems.

Ok, so, you construct two timers, since you have two values in your dictionary, and they run "forever", that is until they're collected.

You're not keeping the references to the timers, so eventually the garbage collector will collect one or both of them.

When it does, it'll run the finalizer of the timer before it collects it, and this will stop its execution.

The solution is to hold on to your timers, or to not use timers at all, but let's stay with the timers, the solution is easy to fix:

int i = 0;
var timers = new List<System.Threading.Timer>();
foreach (KeyValuePair<string, object> record in Dict)
{
    var _recordStored = record; //THIS IS(was) SUPPOSED TO FIX MY WOES!!!
    timers.Add(new System.Threading.Timer(_ => {
       ... rest of your timer code here
    }, null, TimeSpan.Zero, new TimeSpan(0, 0,0,0,1))); // extra end parenthesis here
}
for (; ; )
{
    System.Threading.Thread.Sleep(10000);
}
GC.KeepAlive(timers); // prevent the list from being collected which would ...
                      // end up making the timers eligible for collection (again)

Now: Here's a snafu that will trip you up. If you're running this in the debugger, or in a DEBUG build, the fact that it's the first value that disappears is completely normal, at the same time that it is completely normal for the second value to never disappear.

Why? Because all variables have had their lifetime extended to the duration of the method. In the loop, when you say simply new System.Threading.Timer(...), the compiler silently redefines this as this:

var temp = `new System.Threading.Timer(...)`

This variable (temp) is overwritten on each loop iteration, in effect keeping the last value assigned to it, and this timer will never be collected as long as you are running this in the debugger or in a DEBUG build, since the method that defined it never finishes (the endless loop at the bottom).

The first timer, however, is free to be collected since there is no longer anything keeping a reference to it.

You can verify this with this LINQPad program:

void Main()
{
    for (int index = 1; index <= 5; index++)
    {
        new NoisyObject(index.ToString());
    }
    for (;;)
    {
        // do something that will provoke the garbage collector
        AllocateSomeMemory();
    }
}

private static void AllocateSomeMemory()
{
    GC.KeepAlive(new byte[1024]);
}

public class NoisyObject
{
    private readonly string _Name;

    public NoisyObject(string name)
    {
        _Name = name;
        Debug.WriteLine(name + " constructed");
    }

    ~NoisyObject()
    {
        Debug.WriteLine(_Name + " finalized");
    }
}

When you run this in LINQPad, make sure the little button down to the right in the window is switched to "/o-":

LINQPad optimization switch

You'll get this output:

1 constructed
2 constructed
3 constructed
4 constructed
5 constructed
4 finalized
3 finalized
2 finalized
1 finalized

Notice how the 5 is never finalized?

Now turn on optimizations by clicking that "/o-" button and turn it to "/o+" to enable optimizations (RELEASE build):

1 constructed
2 constructed
3 constructed
4 constructed
5 constructed
5 finalized
4 finalized
3 finalized
2 finalized
1 finalized

And now 5 is also finalized.

Note: What I said about about a temporary variable being introduced doesn't really happen just like that, but the effect is the same.

If you change the main method of the above LINQPad program to this:

void Main()
{
    new NoisyObject("Not finalized when /o-");
    for (;;)
    {
        // do something that will provoke the garbage collector
        AllocateSomeMemory();
    }
}

you can verify that running it under "/o-" will never finalized the object (well, never is such a strong word, not in the time I waited for it anyway), and it will under /"o+".

Bonus question: If you had dropped the GC.KeepAlive(timer); from the solution I presented above, wouldn't the fact that I have stored it in a variable change the behavior?

Not really. The problem is the lifetime of the variables. The compiler and JITter uses information about where a variable is used to figure out which variables are considered "live" at any point in time.

For instance this code:

void Main()
{
    var obj = new SomeObject();
    CallSomeMethodThatNeverReturns();
}

wouldn't this keep obj alive indefinitely? No. At the point where the method is called, that variable is no longer used, no code can/will access it, and thus the garbage collector is free to collect it.

This is what I meant about the lifetime of a variable. In a DEBUG build (optimizations off), or when running under the debugger, the above code looks like this:

void Main()
{
    var obj = new SomeObject();
    CallSomeMethodThatNeverReturns();
    GC.KeepAlive(obj);
}

in essence preventing the object from being collected. GC.KeepAlive is essentially a no-op method that the compiler and JITter cannot "optimize away" and will thus seem to use the object in question, thereby extending its lifetime.

This is not done in a RELEASE build (optimizations on), and so a production program might behave differently from under development.

Conclusion: Do your testing on the same type of build that your customers will get.

Lasse V. Karlsen
  • 380,855
  • 102
  • 628
  • 825
  • Note that when I say "DEBUG" and "RELEASE" builds, I really mean a build where the optimizations are turned off (DEBUG) and on (RELEASE). – Lasse V. Karlsen Jul 15 '13 at 20:04
  • The reason for the debug build and debugger extending the lifetime of variables like this is that presumably you might want to inspect variable values, even if the part where the variable was used has been exected and the debugger stopped further down in the same method. – Lasse V. Karlsen Jul 15 '13 at 20:18