
I'm working on a streaming Twitter client. After 1-2 days of constant running, I'm seeing memory usage of >1.4 GB (it's a 32-bit process), and soon after it hits that amount I'll get an out-of-memory exception on code that's essentially this (this code will error in <30 seconds on my machine):

while (true)
{
  Task.Factory.StartNew(() =>
  {
    dynamic dyn2 = new ExpandoObject();

    //get a ton of text, make the string random
    //enough to not be interned, for the most part
    dyn2.text = Get500kOfText() + Get500kOfText() + DateTime.Now.ToString() + 
      DateTime.Now.Millisecond.ToString(); 
  });
}

I've profiled it and it's definitely due to classes way down in the DLR (from memory, since I don't have my detailed info here): xxRuntimeBinderxx and xxAggregatexx.

This answer from Eric Lippert (Microsoft) seems to indicate that I'm creating expression-parsing objects behind the scenes that never get GC'd, even though no reference is kept to anything in my code.

If that's the case, is there some way in the code above to either prevent it or lessen it?

My fallback is to eliminate the dynamic usage, but I'd prefer not to.

Thanks

Update:

12/14/12:

THE ANSWER:

The way to get this particular example to free up its tasks was to yield (Thread.Sleep(0)), which would then allow the GC to handle the freed up tasks. I'm guessing a message/event loop wasn't being allowed to process in this particular case.
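
For reference, here's roughly what that change looks like applied to the repro above (the Thread.Sleep(0) is the only addition):

    while (true)
    {
        Task.Factory.StartNew(() =>
        {
            dynamic dyn2 = new ExpandoObject();
            dyn2.text = Get500kOfText() + Get500kOfText() + DateTime.Now.ToString() +
                DateTime.Now.Millisecond.ToString();
        });

        //yield the producer's time slice so the scheduler and GC can catch up with completed tasks
        Thread.Sleep(0);
    }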

In the actual code I was using (TPL Dataflow), I was not calling Complete() on the blocks because they were meant to be a never-ending dataflow: the task would take Twitter messages for as long as Twitter would send them. In this model, there was never any reason to tell any of the blocks that they were done, because they'd never BE done as long as the app was running.

Unfortunately, it doesn't look like Dataflow blocks were designed to be very long-running or to handle untold numbers of items, because they actually keep a reference to everything that's sent into them. If I'm wrong, please let me know.

So the workaround is to periodically (based on your memory usage; mine was every 100k Twitter messages) free the blocks and set them up again, roughly as sketched below.
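
Here's a rough sketch of what that recycling looks like. The BatchBlock of 10 and the 100k threshold come from my setup; the class shape and names are just illustrative, not the production code:

    using System.Threading;
    using System.Threading.Tasks;
    using System.Threading.Tasks.Dataflow;

    class RecyclingPipeline
    {
        private BatchBlock<string> _batch;
        private ActionBlock<string[]> _worker;
        private int _received;

        public RecyclingPipeline() { Build(); }

        private void Build()
        {
            _batch = new BatchBlock<string>(10);
            _worker = new ActionBlock<string[]>(messages => { /* act on a batch of tweets */ });
            _batch.LinkTo(_worker, new DataflowLinkOptions { PropagateCompletion = true });
        }

        public async Task PostAsync(string message)
        {
            _batch.Post(message);

            //every 100k messages, retire the old blocks and build fresh ones
            if (Interlocked.Increment(ref _received) % 100000 == 0)
            {
                _batch.Complete();          //let the old pipeline drain
                await _worker.Completion;   //wait until it has processed everything it buffered
                Build();                    //replace it so the old blocks (and their buffers) can be GC'd
            }
        }
    }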

Under this scheme, my memory consumption never goes over 80 MB, and after recycling the blocks and forcing a GC for good measure, the gen2 heap goes back down to 6 MB and everything's fine again.

10/17/12:

  • "This isn't doing anything useful": This example is merely to allow you to generate the problem quickly. It's boiled down from a few hundred lines of code that have nothing to do with the issue.
  • "An infinite loop creating a task and in turn creates objects": Remember- this merely demonstrates the issue quickly- the actual code is sitting there waiting for more streaming data. Also- looking at the code- all of the objects are created inside the Action<> lambda in the task. Why isn't this being cleaned up (eventually) after it goes out of scope? The issue also isn't due to doing it too quickly- the actual code requires more than a day to arrive at the out of memory exception- this just makes it quick enough to try things out.
  • "Are tasks guaranteed to be freed?" An object's an object, isn't it? My understanding is that the scheduler's just using threads in a pool and the lambda that it's executing is going to be thrown away after it's done running regardless.
dethSwatch
  • I don't see this doing anything productive; it seems to be designed to just break the system. Given that, why are you surprised that it's breaking the system? If you have a *real* task that you need to do, why not tell us what that real task is? For example, a pragmatic solution would be to limit the number of tasks you create so that you're not consuming so much memory. – Servy Oct 17 '12 at 17:58
  • I'm not sure what you were expecting. An infinite loop creating a task and in turn creates objects. Objects consume memory... – Lews Therin Oct 17 '12 at 17:59
  • @LewsTherin Well, in theory the tasks could finish and go out of scope, thus freeing up memory for new tasks. In practice, this is likely happening but it's just happening slower than new memory is being consumed. – Servy Oct 17 '12 at 18:00
  • @Servy Are tasks guaranteed to be freed? Regardless of GC? – Lews Therin Oct 17 '12 at 18:02
  • @LewsTherin Not regardless of the GC. At some point this method clearly finishes (the task doesn't loop forever). When that method finishes the delegate the task is running has left scope, thus all variables local to that method are eligible for garbage collection (excluding things like closed over variables, or other hoisted variables). As for the task itself, it won't consume a huge amount of memory, but I would imagine that it becomes eligible for garbage collection shortly after the delegate finishes executing, given that it's not stored in any variable. – Servy Oct 17 '12 at 18:04
  • Right thanks for the lesson :) – Lews Therin Oct 17 '12 at 18:06
  • ""An infinite loop creating a task and in turn creates objects":" your counter-point does not hold: You might be generating (enqueueing) task faster than they can be processed. This, of course, leaks memory because the freeing is slower than the allocating. Why can't you reproduce this with a singly-threaded loop or a constant amount of threads? – usr Oct 17 '12 at 18:42
  • I've put sleeps in this particular code ranging from half a second to a full second and it doesn't seem to make a difference (have you tried the code?). Also keep in mind that this requires more than a day to happen in the production code, where it's not egregiously generating threads as quickly as possible, as this code does to quickly show the error. – dethSwatch Oct 17 '12 at 18:45
  • I ran this code; the process' commit size varied wildly, going as high as 1.5 GB, but it never explodes. The diagnostic is that starting a lot of threads that use a lot of memory begets a process that uses a lot of memory. If you have enough cores then that can certainly cause OOM. A 64-bit version of Windows is a cheap and simple fix. – Hans Passant Oct 17 '12 at 18:45
  • I think there is something to what the OP says. He does not create infinite tasks in production yet he has a problem. I just think he needs to provide a repro that does not have a different obvious bug than the one he wants to show. – usr Oct 17 '12 at 18:55

2 Answers


This has more to do with the producer running far ahead of the consumer than with the DLR. The loop creates tasks as fast as possible, and the tasks aren't started anywhere near as quickly. It's easy to figure out just how far it lags behind:

    int count = 0;

    //print the number of tasks that have been scheduled but not yet finished, every 500 ms
    new Timer(_ => Console.WriteLine(count), 0, 0, 500);

    while (true)
    {
        Interlocked.Increment(ref count);

        Task.Factory.StartNew(() =>
        {
            dynamic dyn2 = new ExpandoObject();
            dyn2.text = Get500kOfText() + Get500kOfText() + DateTime.Now.ToString() +
                DateTime.Now.Millisecond.ToString();

            Interlocked.Decrement(ref count);
        });
    }

Output:

324080
751802
1074713
1620403
1997559
2431238

That's a lot of backlog for 3 seconds' worth of scheduling. Removing the Task.Factory.StartNew (i.e. single-threaded execution) yields stable memory usage.

The repro you've given seems a bit contrived, though. If too many concurrent tasks really is your problem, you could try a custom task scheduler that limits concurrent scheduling.
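
If too many in-flight tasks does turn out to be the issue, something along these lines could work as well. This is only a sketch that throttles the producer with a SemaphoreSlim rather than a full custom TaskScheduler; the cap of 1000 and the Get500kOfText stub are arbitrary stand-ins:

    using System;
    using System.Dynamic;
    using System.Threading;
    using System.Threading.Tasks;

    class ThrottledRepro
    {
        //allow at most 1000 work items in flight at once (arbitrary cap)
        private static readonly SemaphoreSlim Gate = new SemaphoreSlim(1000);

        static void Main()
        {
            while (true)
            {
                Gate.Wait();    //the producer blocks here once the cap is reached

                Task.Factory.StartNew(() =>
                {
                    try
                    {
                        dynamic dyn2 = new ExpandoObject();
                        dyn2.text = Get500kOfText() + Get500kOfText() +
                            DateTime.Now.ToString() + DateTime.Now.Millisecond.ToString();
                    }
                    finally
                    {
                        Gate.Release();     //make room for the producer to schedule another task
                    }
                });
            }
        }

        //stand-in for the question's helper
        static string Get500kOfText() { return new string('x', 500000); }
    }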

Asti
  • Right, but remember: the production code runs at the speed of Twitter, which in my usage is at most 10 messages/second followed by several seconds of inactivity. Also, putting strategic sleeps in this example will still generate the out-of-memory exception; it'll just take longer. – dethSwatch Oct 17 '12 at 19:54

The problem here is not that the tasks you are creating aren't being cleaned up. Asti has demonstrated that your code is creating tasks faster than they can be processed, so while you are clearing up the memory of completed tasks, you still run out eventually.

You have said:

putting strategic sleeps in this example will still generate the out of memory exception- it'll just take longer

You haven't shown the code for this, or any other example that bounds the number of concurrent tasks. My guess is that you are limiting the creation to some degree, but that the rate of creation is still faster than the rate of consumption. Here is my own bounded example:

int numConcurrentActions = 100000;
BlockingCollection<Task> tasks = new BlockingCollection<Task>();

Action someAction = () =>
{
    dynamic dyn = new System.Dynamic.ExpandoObject();

    dyn.text = Get500kOfText() + Get500kOfText() 
        + DateTime.Now.ToString() + DateTime.Now.Millisecond.ToString();
};

//add a fixed number of tasks
for (int i = 0; i < numConcurrentActions; i++)
{
    tasks.Add(new Task(someAction));
}

//take a task out, set a continuation to add a new one when it finishes, 
//and then start the task.
foreach (Task t in tasks.GetConsumingEnumerable())
{
    t.ContinueWith(_ =>
    {
        tasks.Add(new Task(someAction));
    });
    t.Start();
}

This code will ensure that no more than 100,000 tasks will be running at any one time. When I run this the memory is stable (when averaged over a number of seconds). It bounds the tasks by creating a fixed number, and then setting a continuation to schedule a new task whenever an existing one finishes.

What this tells us is that, since your real data is based on a feed from some external source, you are getting data from that feed ever so slightly faster than you can process it. You have a few options here. You could queue items as they come in, ensure that only a limited number can be running at once, and throw out requests once you've exceeded your capacity (or find some other way of filtering the input so that you don't process all of it); or you could get better hardware (or optimize the processing method you have) so that you are able to process requests faster than they come in.
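
For the "throw out requests over capacity" option, a bounded BlockingCollection gets you most of the way there. A sketch (the capacity of 10,000 and the dropped-message counter are purely illustrative):

    using System;
    using System.Collections.Concurrent;
    using System.Threading;
    using System.Threading.Tasks;

    class BoundedIntake
    {
        //hold at most 10,000 pending messages (illustrative capacity)
        private static readonly BlockingCollection<string> Pending =
            new BlockingCollection<string>(boundedCapacity: 10000);

        private static long dropped;

        //called by the feed; returns false (and drops the message) when the queue is full
        public static bool TryEnqueue(string message)
        {
            if (Pending.TryAdd(message))
                return true;

            Interlocked.Increment(ref dropped);
            return false;
        }

        //a single consumer drains the queue as fast as it can
        public static Task StartConsumer(Action<string> process)
        {
            return Task.Factory.StartNew(() =>
            {
                foreach (string message in Pending.GetConsumingEnumerable())
                    process(message);
            }, TaskCreationOptions.LongRunning);
        }
    }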

While normally I would say that people tend to try to optimize code when it already runs "fast enough", this clearly isn't the case for you. You have a fairly hard benchmark that you need to hit; you need to process items faster than they come in. Currently you aren't meeting that benchmark (but since it runs for a while before failing, you shouldn't be that far off).

Servy
  • I'll test your hypothesis that it's truly producing more than can be consumed. I do actually queue every message (I'm using a TPL Dataflow BatchBlock of 10) before acting on them, and the hardware is a 4-core Core i7 with 12 GB, so that's not the issue. But my main concern is that there are frequently several seconds of inactivity (no one's tweeting), and there's rather little chance (I think) that it can't catch up in those periods. – dethSwatch Oct 18 '12 at 14:17