I am seeing a strange memory leak in a computationally expensive content-based image retrieval (CBIR) .NET application.

The design is a service class with a thread loop that captures images from some source and passes them to an image tagging thread for annotation.

The service class queries image tags from the repository at specified time intervals and stores them in its in-memory cache (a Dictionary) to avoid frequent database hits.

The classes in the project are:

class Tag
{
    public Guid Id { get; set; }        // tag id
    public string Name { get; set; }    // tag name: e.g. 'sky','forest','road',...
    public byte[] Jpeg { get; set; }    // tag jpeg image patch sample
}

interface IRepository
{
    IEnumerable<Tag> FindAll();
}

class Service
{        
    private IDictionary<Guid, Tag> Cache { get; set; }  // to avoid frequent db reads
    // image capture background worker (ICBW)
    // image annotation background worker (IABW)
}

class Image
{
    public byte[] Jpeg { get; set; }
    public IEnumerable<Tag> Tags { get; set; }
}

The ICBW worker captures a JPEG image from some image source and passes it to the IABW worker for annotation. The IABW worker first updates Cache if the refresh interval has elapsed, then annotates the image with some algorithm: it creates an Image object, attaches Tags to it, and stores it in the annotation repository.

The cache update snippet in the IABW worker is:

IEnumerable<Tag> tags = repository.FindAll();
Cache.Clear();
tags.ForEach(t => Cache.Add(t.Id, t));
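
For comparison, here is a minimal sketch of the same refresh written so that the tag sequence is enumerated exactly once and readers never observe a half-filled cache. It assumes a single writer (the IABW worker); `CacheRefresher` and `Refresh` are hypothetical names that are not part of the project above. It only restructures the update and is not claimed to fix the leak.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class Tag
{
    public Guid Id { get; set; }
    public string Name { get; set; }
    public byte[] Jpeg { get; set; }
}

// Hypothetical helper, not part of the original project.
class CacheRefresher
{
    // volatile so that reader threads see the newly published dictionary.
    private volatile IDictionary<Guid, Tag> cache = new Dictionary<Guid, Tag>();

    public IDictionary<Guid, Tag> Cache { get { return cache; } }

    public void Refresh(IEnumerable<Tag> tags)
    {
        // ToDictionary walks the sequence once and throws on duplicate ids.
        // The old dictionary becomes garbage as soon as no reader holds it.
        cache = tags.ToDictionary(t => t.Id);
    }
}
```

Image objects created before a refresh keep their old Tag instances alive until those Image objects themselves become unreachable, which is expected behaviour rather than a leak.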

IABW is called many times a second and is pretty processor intensive.

While running it for days I noticed memory growth in Task Manager. Using Perfmon to watch Process/Private Bytes and .NET Memory/Bytes in all heaps, I found both increasing over time.
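
The same two figures can also be logged from inside the process, which makes it easier to correlate growth with cache updates. This is a minimal sketch; `MemoryProbe` and `Snapshot` are hypothetical names, and the values only approximate the Perfmon counters mentioned above.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper, not part of the original project.
static class MemoryProbe
{
    // Reports the managed heap size and the process private bytes.
    public static string Snapshot()
    {
        long managed = GC.GetTotalMemory(false); // false: do not force a collection
        using (Process process = Process.GetCurrentProcess())
        {
            long privateBytes = process.PrivateMemorySize64;
            return string.Format("managed: {0:N0} B, private: {1:N0} B", managed, privateBytes);
        }
    }
}
```

Calling `MemoryProbe.Snapshot()` right before and after each cache update would show whether the growth tracks the updates.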

Experimenting with the application, I found that the Cache update is the problem. If the cache is not updated, memory does not grow. But if the Cache is updated as often as once every 1-5 minutes, the application runs out of memory pretty fast.

What might be the reason for this memory leak? Image objects are created quite often and contain references to Tag objects in Cache. I presume that when the Cache dictionary is repopulated, those references are somehow not garbage collected afterwards.

Do I need to explicitly null managed byte[] objects to avoid the memory leak, e.g. by implementing Tag and Image as IDisposable?

Edit: 4 Aug 2011, added the buggy code snippet that causes a quick memory leak.

static void Main(string[] args)
{
    while (!Console.KeyAvailable)
    {
        IEnumerable<byte[]> data = CreateEnumeration(100);
        PinEntries(data);
        Thread.Sleep(900);
        Console.Write(String.Format("gc mem: {0}\r", GC.GetTotalMemory(true)));
    }
}

static IEnumerable<byte[]> CreateEnumeration(int size)
{
    Random random = new Random();
    IList<byte[]> data = new List<byte[]>();
    for (int i = 0; i < size; i++)
    {
        byte[] vector = new byte[12345];
        random.NextBytes(vector);
        data.Add(vector);
    }
    return data;
}

static void PinEntries(IEnumerable<byte[]> data)
{
    var handles = data.Select(d => GCHandle.Alloc(d, GCHandleType.Pinned));
    var ptrs = handles.Select(h => h.AddrOfPinnedObject());
    IntPtr[] dataPtrs = ptrs.ToArray();
    Thread.Sleep(100); // unmanaged function call taking byte** data
    handles.ToList().ForEach(h => h.Free());
}
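
One possible reading of the snippet above, offered as an assumption rather than something confirmed in this thread: `handles` is a deferred LINQ query, so it runs once for `ptrs.ToArray()` and again for `handles.ToList()`. The second pass allocates a fresh set of pinned handles and frees only those, so the first set is never released. The sketch below makes the double enumeration visible with a counter; `PinDemo`, `PinLazy`, `PinOnce`, and `AllocCount` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Runtime.InteropServices;

// Hypothetical demo class, not part of the original code.
static class PinDemo
{
    // Counts GCHandle.Alloc calls so that repeated enumeration shows up.
    public static int AllocCount;

    // Same shape as PinEntries: the deferred query is enumerated twice.
    public static void PinLazy(IEnumerable<byte[]> data)
    {
        var handles = data.Select(d => { AllocCount++; return GCHandle.Alloc(d, GCHandleType.Pinned); });
        IntPtr[] ptrs = handles.Select(h => h.AddrOfPinnedObject()).ToArray(); // first pass allocates
        handles.ToList().ForEach(h => h.Free()); // second pass allocates again and frees only the new handles
    }

    // Variant that materializes the handles once, so every allocated handle is freed.
    public static void PinOnce(IEnumerable<byte[]> data)
    {
        List<GCHandle> handles = data
            .Select(d => { AllocCount++; return GCHandle.Alloc(d, GCHandleType.Pinned); })
            .ToList();
        try
        {
            IntPtr[] ptrs = handles.Select(h => h.AddrOfPinnedObject()).ToArray();
            // The unmanaged call taking byte** would go here.
        }
        finally
        {
            handles.ForEach(h => h.Free());
        }
    }
}
```

With three input arrays, `PinLazy` increments `AllocCount` by six while `PinOnce` increments it by three. If this reading applies, the arrays pinned in the first pass stay pinned and can never be collected.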
marc_s
Chesnokov Yuriy

1 Answer

No, you don't need to set anything to null or dispose of anything if it's just memory as you've shown.

I suggest you get hold of a good profiler to work out where the leak is. Do you have anything non-memory-related that you might be failing to dispose of, e.g. loading a GDI+ image to get the bytes?

Jon Skeet
  • Yes, these are just plain bytes of JPEG-compressed images. I wrap the Bitmap created from the JPEG in using {} statements in IABW when annotating it. – Chesnokov Yuriy Aug 03 '11 at 13:30
  • There is no memory leak if Cache is not updated. I believe that rules out pending bitmap handles – Chesnokov Yuriy Aug 03 '11 at 13:32
  • @Chesnokov: In that case I would break out the profiler. Try to find some way to get it to leak memory really quickly, so you can make changes and see what happens without waiting 24 hours etc :) – Jon Skeet Aug 03 '11 at 13:35
  • :) There is no need to wait 24 hours if the Cache update is as frequent as once every 1-5 minutes. In that case you can watch memory grow in Perfmon as it runs. I suspected the managed byte[] references as the culprit and thought of nulling them explicitly – Chesnokov Yuriy Aug 03 '11 at 13:42
  • @Chesnokov: No, if nothing else has a reference to them, it shouldn't make any odds. I suggest you update the cache once a second or so, and see if you run out of memory very quickly. Note that it's fine for memory to increase for a while, but it should stabilize. – Jon Skeet Aug 03 '11 at 13:45
  • No, it does not stabilize. It quickly gets to 1 GB or so, after which an OutOfMemory exception is caught (due to the 32-bit architecture) and I terminate the thread to avoid running under that condition. The memory is not released afterwards either – Chesnokov Yuriy Aug 03 '11 at 14:05
  • @Chesnokov: Right, so that really does sound like a leak. If you're able to come up with a short but complete program which demonstrates the problem, we may be able to help more... otherwise it's really just a case of profiling. – Jon Skeet Aug 03 '11 at 15:17
  • Yes, it is. I hope to reproduce it in a separate console application. I tried to simulate it in a short console application but the problem did not appear. I presume I have to move the domain business logic explicitly into a separate console app, and I hope to have it soon. It would be interesting to see whether the problem is related to GC – Chesnokov Yuriy Aug 03 '11 at 17:02
  • Thank you for the suggestion to update the cache every second. I crafted a simple console app which processed an image and updated the db in a loop, and memory increased very quickly. I presume the cause of the leak may be found in http://stackoverflow.com/questions/6937933/memory-leaks-in-passing-ienumerablebyte-array-to-unmanaged-function-as-byte – Chesnokov Yuriy Aug 04 '11 at 07:55
  • It is not the unmanaged code causing the leak. I put together a console app to simulate the bug, attaching it here below... – Chesnokov Yuriy Aug 04 '11 at 08:44