14

Possible Duplicate:
Possible memoryleak in ConcurrentBag?

Edit1:

The actual question is. Can you confirm this or is my sample wrong and I am missing somthing obvious?

I have thought that ConcurrentBag is simpy a replacement for an unorderd list. But I was wrong. ConcurrentBag does add itself to as ThreadLocal to the creating thread which does basically cause a memory leak.

   class Program
    {
        static void Main(string[] args)
        {
            var start = GC.GetTotalMemory(true);
            new Program().Start(args);
            Console.WriteLine("Diff: {0:N0} bytes", GC.GetTotalMemory(true) - start);
            GC.Collect();
            GC.WaitForPendingFinalizers();
            GC.Collect();
            Thread.Sleep(5000);
        }

        private void Start(string[] args)
        {
            for (int i = 0; i < 1000; i++)
            { 
                var bag = new ConcurrentBag<byte>();
                bag.Add(1);
                byte by;
                while (bag.TryTake(out by)) ;
            }
        }

I can make Diff 250 KB or 100 GB depending on how much data I add to the bags. The data nor the bags go away.

When I break into this with Windbg and I do a !DumpHeap -type Concurrent

....

000007ff00046858        1           24 System.Threading.ThreadLocal`1+GenericHolder`3[[System.Collections.Concurrent.ConcurrentBag`1+ThreadLocalList[[System.Byte, mscorlib]], System],[System.Threading.ThreadLocal`1+C0[[System.Collections.Concurrent.ConcurrentBag`1+ThreadLocalList[[System.Byte, mscorlib]], System]], mscorlib],[System.Threading.ThreadLocal`1+C0[[System.Collections.Concurrent.ConcurrentBag`1+ThreadLocalList[[System.Byte, mscorlib]], System]], mscorlib],[System.Threading.ThreadLocal`1+C0[[System.Collections.Concurrent.ConcurrentBag`1+ThreadLocalList[[System.Byte, mscorlib]], System]], mscorlib]]
000007feed812648        2           64 System.Collections.Concurrent.ConcurrentStack`1[[System.Int32, mscorlib]]
000007feece41528        1          112 System.Collections.Concurrent.CDSCollectionETWBCLProvider
000007ff000469e0     1000        32000 System.Threading.ThreadLocal`1+Boxed[[System.Collections.Concurrent.ConcurrentBag`1+ThreadLocalList[[System.Byte, mscorlib]], System]]
000007feed815900     1000        32000 System.Collections.Concurrent.ConcurrentStack`1+Node[[System.Int32, mscorlib]]
000007ff00045530     1000        72000 System.Collections.Concurrent.ConcurrentBag`1+ThreadLocalList[[System.Byte, mscorlib]]

When I do create an empty ConcurrentBag to let some worker threads add data to it ConcurrentBag and its data will be there as long as the creating thread is still alive.

This way I got a several GB memory leak. I did "fix" this by using a List and locks. ConcurrentBag may be fast but it is useless as simple replacement for a List with the same object lifetime.

If I ever create a ConcurrentBag on the main thread I will keep it as long as the thread is alive. This is not something I would expect and it can cause major pain.

Community
  • 1
  • 1
Alois Kraus
  • 13,229
  • 1
  • 38
  • 64

3 Answers3

3

You are right that ConcurrentBag creates a ThreadLocal copy, in fact they are optimized for scenarios where the same thread is reading and writing the data to the bag: "... ConcurrentBag is a thread-safe bag implementation, optimized for scenarios where the same thread will be both producing and consuming data stored in the bag."

On the otherhand, I do not see a strange behaviour here; the thread lives and the concurrent bag lives. When thread finishes GC will do it's job.

daryal
  • 14,643
  • 4
  • 38
  • 54
  • The problem is that memory seems to be kept alive until the thread exits. What if it is a long-running thread? – usr Jun 01 '12 at 13:12
  • I have thought that ConcurrentBag would be suitable with TPL. But if I cannot control the lifetime of the thread I cannot be sure when the memory is reclaimed. This is a rather limiting factor. – Alois Kraus Jun 01 '12 at 16:01
  • 1
    I will mark this as answer though I do think that local variable of ConcurrentBag should never allocate a new TLS instance without a way to release it. If you create 1 million instances of ConcurrentBag in a local method you will keep them and you have no way to call dispose on the bag because it does not implement IDisposable. – Alois Kraus Jun 04 '12 at 10:00
  • @Alois Kraus: ThreadLocal has a finalizer, so when there is no reference left on the ConcurrentBag, the thread static will be cleared. It would be preferable to have exposed a Dispose on ConcurrentBag but at least it's not leaking. – Jeff Cyr Nov 12 '12 at 15:34
  • 1
    You have checked this on .NET 4.5. For .NET 4.0 there is a leak which does stay there until the thread that holds it terminates. It has got better with .NET 4.5 but you still have no explicit control over your memory which is a pity. The data structures for a item in the bag have also lost most of its feat. I still miss an unordered lock free collection in .NET which has no leaks or finalization issues. – Alois Kraus Nov 12 '12 at 20:56
2

From the documentation

ConcurrentBag is a thread-safe bag implementation, optimized for scenarios where the same thread will be both producing and consuming data stored in the bag.

and from When to use a thread-safe collection

In mixed producer-consumer scenarios, ConcurrentBag is generally much faster and more scalable than any other concurrent collection type for both large and small workloads.

I'd say that your assumptions about ConcurrentBag are incorrect. First, it doesn't add itsels to ThreadLocal, it uses thread local storage to provide separate internal lists for each thread that access it. It is more than just a thread-safe unordered list.

What you consider a memory leak is actually the expected behavior once you realize that the bag uses TLS - there is no need to clear the data as long as the thread is in use.

Having said all that, I hadn't realized the extra functionality of ConcurrentBag myself until just now.

I've found a very good description of how ConcurrentBag uses separate lists and the cost of its methods in different scenarions in "What is ConcurrentBag". I do wish this description appeared in the MSDN documentation.

Personally, I'll start using ConcurrentBag a lot more now that I know of its special behavior.

UPDATE:

Just checked this post by Ayende saying that "ThreadLocal, which ConcurrentBag uses, didn’t expect to have a lot of instances. That has been fixed, and now can run fairly fast"

Panagiotis Kanavos
  • 120,703
  • 13
  • 188
  • 236
  • 1
    This does not address the problem: How to be able to free memory without terminating all related threads (think of thread-pool threads)? – usr Jun 01 '12 at 13:16
  • There IS no problem. That's the way the class (and TLS) works. By the way, the exact same question was asked and answered here http://stackoverflow.com/questions/5353164/possible-memoryleak-in-concurrentbag – Panagiotis Kanavos Jun 01 '12 at 13:30
  • The first Link does not work. The second link shows a quite simplified view of it. It does not address any of the issues I was confronted with. E.g. if you create a bag of ints you do not use 4 bytes per item but 32 bytes (Node class instance (20 bytes on x64), prev pointer (8 bytes x64), next pointer (8 bytes x64), T item 4 bytes). Fast concurrent access for mass data is therefore out of question with this class because the overhead is quite high. – Alois Kraus Jun 01 '12 at 16:09
  • Oops, the first link has a single full stop at the end. Fixed. As for the issues, I think you are trying to fit the class to a scenario it was never created for. A class created for efficient publisher/consumer processing will never be as memory efficient as a simple array with a lock. On the other hand, a lock-free container can scale to many more cores than a lock. In fact, TPL in general was not designed for ultra-efficient mass processing - that's what SSE2 operations are for and .NET does NOT have them. Mono does, and I sorely miss them. – Panagiotis Kanavos Jun 05 '12 at 06:40
  • And the comment to [post by Ayende](http://ayende.com/blog/156097/the-high-cost-of-concurrentbag-in-net-4-0?key=742a9bdb-d633-4779-8a43-a5393c97c75a&utm_source=feedburner&utm_medium=twitter&utm_campaign=Feed%3a%20AyendeRahien%20%28Ayende%20@%20Rahien%29) tells: "This is still not fixed with .NET 4.5" – Gennady Vanin Геннадий Ванин Mar 14 '13 at 10:24
  • @AloisKraus Have you ever heard of the LinkedList structure? You know, the one that, unlike List, you don't need to recreate each time you want to remove some element element from the middle. – Ark-kun Apr 28 '13 at 03:23
  • @Ark-kun: There are many possible implementations of a linked list. The one from .NET 4.0 was much more memory hungry than the one of .NET 4.5. So yes there is improvement but I still miss a IDisposable on that one to be sure to release my memory deterministically. Otherwise I will have to wait for the next Gen 2 collection. – Alois Kraus Apr 28 '13 at 13:01
-3

Why not move the Console.WriteLine after the second GC.Collect()? Otherwise you may be looking at some more objects than you intended.

You can also try putting everything in your Main into a loop to get a few stats. Even if you were to not move your write you probably would see smaller deltas afterwards.

Cheers!