The following concurrency performance analysis shows the work done by a parallel foreach:

[Image: concurrency profiler timeline of the parallel foreach threads]

Inside the loop each thread reads data from the DB and processes it. There are no locks between threads, as each one processes different data.

There appear to be periodic locks across all the threads of the foreach for unknown reasons (see the black vertical rectangles). If you look at the selected locked segment (the dark red one), the stack shows the thread locked in the StockModel.Quotation constructor. The code there just constructs two empty lists!

I've read somewhere that this could be caused by the GC so I've changed the garbage collection to run in server mode with:

<runtime>
    <gcServer enabled="true"/>
</runtime>

I got a small improvement (about 10% - 15% faster) but I still have the vertical locks everywhere.

I've also added WITH(NOLOCK) to all the DB queries, as I'm only reading data, but it made no difference.

Any hint on what's happening here?

The computer where the analysis has been done has 8 cores.

EDIT: After enabling the Microsoft Symbol Server, it turns out that all threads are blocked on calls like wait_for_gc_done or WaitUntilGCComplete. I thought that with server GC enabled I would have one GC per thread and so avoid the "vertical" lock, but that does not seem to be the case. Am I wrong?

Second question: as the machine is not under memory pressure (5 of 8 gigs are used) is there a way to delay the GC execution or to pause it until the parallel foreach ends (or to configure it to fire less often)?

Ignacio Soler Garcia
  • In case you're allocating a lot of objects and the locks are indeed caused by the GC, did you try to force a GC.Collect just before starting the TPL work? GC.Collect with GCCollectionMode.Forced. – Alex Feb 19 '13 at 16:25
  • Well, inside the loop I'm allocating a big quantity of small objects that are 'abandoned' at the end of each iteration. Could this lock the whole set of threads if they are GC'ed? – Ignacio Soler Garcia Feb 19 '13 at 16:33
  • Enable the Microsoft Symbol Server to get better stack traces. Given the long wait, this just looks like plain garbage collections. – Hans Passant Feb 19 '13 at 16:59
  • @SoMoS You can subscribe to GC events or use PerfMon.exe to see generational collections. If the hiccups are an issue for your app, you could try to lessen the GC burden and allocate your small objects on the stack (convert them to structs). Marc Gravell has an excellent blog post about it: http://marcgravell.blogspot.ch/2011/10/assault-by-gc.html – Alex Feb 19 '13 at 17:48
  • @HansPassant: I will, thanks. And I'll post a new callstack image. – Ignacio Soler Garcia Feb 19 '13 at 18:55
  • @Alex: it's not really an issue but progress is not something constant, it's bumpy, it goes in fits and starts and I think the process would be much faster if I could delay whatever locks the threads (the GC) until the end of the process. – Ignacio Soler Garcia Feb 19 '13 at 18:58
  • It might just be a lock in a third-party component (e.g. a lock in the .NET Framework). In the past I troubleshot issues like this by using ANTS profiler (wall clock time). – atlaste Feb 19 '13 at 19:13
  • @HansPassant, you were right. Can you check my edit? Thanks. – Ignacio Soler Garcia Feb 19 '13 at 21:00
  • Also note that the machine memory used is not related to garbage collection. You need to look at your process memory. If it's a 32-bit app you only have 2 gigs of virtual address space, and fragmentation plays a large role. See Fundamentals of garbage collection here: http://msdn.microsoft.com/en-us/library/ee787088.aspx – Kim Feb 20 '13 at 15:17
  • @Kim: thanks. The application is 64 bits due to high memory requirements so this shouldn't be a problem (right?) Also, I still have the mystery about why all the threads stop if the GC is set to server mode. – Ignacio Soler Garcia Feb 20 '13 at 21:24
  • If you look through the link about fundamentals of GC above it has nice diagrams of the GC thread and other threads being suspended depending on the server vs. workstation mode. With 64 bit, you're still limited to 4Gb virtual memory. – Kim Feb 20 '13 at 21:47
  • 4Gb? Then it makes sense, I'm reaching 4Gb with ease ... – Ignacio Soler Garcia Feb 21 '13 at 09:26
  • @Kim Why should you be limited to 4 GB on x64 architectures? It is actually 8 TB (theoretically) but limited by your hardware and page file settings. The CLR, however, cannot create objects bigger than 2 GB. – Alex Feb 22 '13 at 23:05
  • @Alex- eech. I'm not sure where I got that from now. Maybe I was looking at 32-bit process with IMAGE_FILE_LARGE_ADDRESS_AWARE set? Thanks for pointing out the mistake. – Kim Feb 25 '13 at 15:11
  • Have you tried SustainedLowLatency mode in .NET Framework 4.5? http://blogs.msdn.com/b/dotnet/archive/2012/07/20/the-net-framework-4-5-includes-new-garbage-collector-enhancements-for-client-and-server-apps.aspx – Sergey Zyuzin Feb 25 '13 at 18:37
  • What size of object do you allocate? Are there objects larger than 85kb? Can you collect GC related perfcounters and display them?(like generation size growth, LOH size over time etc.) – Sergey Zyuzin Feb 25 '13 at 18:39
  • @Sergey: there are lots of small objects so probably there are no objects bigger than 85Kb. I'll try to collect perfcounters and paste them here. Thanks. – Ignacio Soler Garcia Feb 25 '13 at 22:02
  • @Alex - the memory limit of an x64 is more like about 18 exabytes which is 1.8 times 10 to the power of 19! Still a little bit more than 8TB. – Hendrik Wiese Feb 26 '13 at 09:10
  • @SeveQ The theoretical max. is 16 exabytes (2^64), the process address space is limited to 8TB. – Alex Feb 26 '13 at 09:22
  • @Alex Alright, thanks for the update. But that's a soft limit defined by the OS, isn't it? – Hendrik Wiese Feb 26 '13 at 09:24
  • @SeveQ At least for Windows yes ( http://blogs.technet.com/b/markrussinovich/archive/2008/11/17/3155406.aspx ). No idea how Unix / Linux handles it. – Alex Feb 26 '13 at 09:26
  • Well, I guess MS will raise that limit before it's even getting in sight... – Hendrik Wiese Feb 26 '13 at 09:51

2 Answers

If your StockModel.Quotation class allows for it, you could create a pool to limit the number of new objects created. This is a technique they sometimes use in games to prevent the garbage collector stalling in the middle of renders.

Here's a basic pool implementation:

    using System;
    using System.Collections.Generic;

    class StockQuotationPool
    {
        // Released instances waiting to be reused. Not thread safe.
        private readonly List<StockQuotation> poolItems;

        // Number of instances that may still be handed out.
        private int itemsInPool;

        public StockQuotationPool(int poolSize)
        {
            this.poolItems = new List<StockQuotation>(poolSize);
            this.itemsInPool = poolSize;
        }

        public StockQuotation Create(string name, decimal value)
        {
            if (this.itemsInPool == 0)
            {
                // Block until new item ready - maybe use semaphore.
                throw new NotImplementedException();
            }

            this.itemsInPool--;

            // Capacity is left, but no released instance is available yet.
            if (this.poolItems.Count == 0)
            {
                return new StockQuotation(name, value);
            }

            // Otherwise reuse a released instance; taking the last one
            // avoids shifting the rest of the list.
            int last = this.poolItems.Count - 1;
            var item = this.poolItems[last];
            this.poolItems.RemoveAt(last);

            item.Name = name;
            item.Value = value;

            return item;
        }

        public void Release(StockQuotation quote)
        {
            // Ignore double releases of the same instance.
            if (!this.poolItems.Contains(quote))
            {
                this.poolItems.Add(quote);
                this.itemsInPool++;
            }
        }
    }

That's assuming that the StockQuotation looks something like this:

    class StockQuotation
    {
        internal StockQuotation(string name, decimal value)
        {
            this.Name = name;
            this.Value = value;
        }

        public string Name { get; set; }
        public decimal Value { get; set; }
    }

Then instead of calling the new StockQuotation() constructor, you ask the pool for a new instance. The pool returns an existing instance (you can precreate them if you want) and sets all the properties so that it looks like a new instance. You may need to play around until you find a pool size that is large enough to accommodate the threads at the same time.

Here's how you'd call it from the thread.

    // Get the pool, maybe from a singleton.
    var pool = new StockQuotationPool(100);


    var quote = pool.Create("test", 1.00m);


    try
    {
        // Work with quote

    }
    finally
    {
        pool.Release(quote);
    }

Lastly, this class isn't thread safe at the moment. Let me know if you need any help with making it so.
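
One possible way to make it thread safe is sketched below. This is only an illustration under a few assumptions, not a drop-in replacement: the name ConcurrentStockQuotationPool is made up, the StockQuotation stand-in is repeated so the snippet compiles on its own, and unlike the pool above it is unbounded (it never blocks) and does not guard against releasing the same instance twice. It relies on ConcurrentBag<T> from System.Collections.Concurrent, which tends to suit a parallel foreach because a thread usually takes back an item it released itself.

```csharp
using System.Collections.Concurrent;

// Stand-in for the StockQuotation class above so the sketch is self-contained.
class StockQuotation
{
    internal StockQuotation(string name, decimal value)
    {
        this.Name = name;
        this.Value = value;
    }

    public string Name { get; set; }
    public decimal Value { get; set; }
}

// Hypothetical thread-safe variant of the pool (unbounded, no double-release check).
class ConcurrentStockQuotationPool
{
    private readonly ConcurrentBag<StockQuotation> items =
        new ConcurrentBag<StockQuotation>();

    public StockQuotation Create(string name, decimal value)
    {
        StockQuotation item;
        if (!this.items.TryTake(out item))
        {
            // Nothing has been released yet, so allocate a fresh instance.
            return new StockQuotation(name, value);
        }

        // Reset the reused instance so it looks like a new one.
        item.Name = name;
        item.Value = value;
        return item;
    }

    public void Release(StockQuotation quote)
    {
        // The caller must not touch the instance after releasing it.
        this.items.Add(quote);
    }
}
```

The calling code stays the same as in the usage example above: Create inside the loop body, Release in a finally block.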

Mike the Tike
  • Nice idea, I'll think on this. The problem is that as this was not previously considered I'm not sure about the logic required to know when the release can be called. – Ignacio Soler Garcia Feb 26 '13 at 15:35
  • I've updated the code to include a try/finally. You don't want to lose objects when exceptions occur. – Mike the Tike Feb 26 '13 at 15:40
  • In terms of when to call pool.Release(), I'd say basically just before it goes out of scope if it's in a method. If it's stored in a field on an object, then you can make that object Disposable and call Release() in the Dispose method. **The sooner you can release it, the better** and the smaller your pool will need to be. – Mike the Tike Feb 26 '13 at 15:44
  • Yeah, I thought about implementing IDisposable, but the same object is an instance member of several classes and is included in several Lists. Currently, determining when an instance is not needed elsewhere and can be disposed is not straightforward. – Ignacio Soler Garcia Feb 27 '13 at 12:05

You could try using GCLatencyMode.LowLatency. See the related question here: Prevent .NET Garbage collection for short period of time
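
As a rough sketch of what that could look like: GCSettings.LatencyMode and GCLatencyMode live in System.Runtime, while the LowLatencyScope helper below is a made-up name used only for illustration. As far as I know, LowLatency is documented as not available for the server garbage collector, and even under workstation GC it only suppresses blocking generation 2 collections, so generation 0 and 1 collections can still pause the threads.

```csharp
using System;
using System.Runtime;

// Hypothetical helper: runs some work under a given GC latency mode
// and always restores the previous mode afterwards.
static class LowLatencyScope
{
    public static void Run(Action work, GCLatencyMode mode)
    {
        GCLatencyMode previous = GCSettings.LatencyMode;
        try
        {
            GCSettings.LatencyMode = mode;
            work();
        }
        finally
        {
            // Restore the old mode even if the work throws.
            GCSettings.LatencyMode = previous;
        }
    }
}
```

The parallel foreach would then be wrapped in a call such as LowLatencyScope.Run(() => { /* parallel work */ }, GCLatencyMode.LowLatency), keeping the low-latency period as short as possible.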

I recently attempted this with no luck. Garbage collection was still being triggered when caching icon-sized bitmap images on a form I was displaying. What worked for me was using ANTS Performance Profiler and Reflector to find the exact calls that were causing the GC.Collect and work around them.

Community
Kim
  • Nothing strange here. I'm allocating big amounts of objects for a short time period but the machine has a lot of RAM so I would prefer to "stop" for a while the GC if I could do it. – Ignacio Soler Garcia Feb 19 '13 at 21:36
  • Do you have allocations occurring inside a loop? Can they be moved outside? see http://stackoverflow.com/questions/3412003/allocating-memory-inside-loop-vs-outside-loop – Kim Feb 20 '13 at 15:07
  • Mmmm, each loop reads its own data from a DB, processes it and generates a result that is stored for further processing. After that the generated data is discarded. I don't see a way to avoid that ... – Ignacio Soler Garcia Feb 20 '13 at 21:27