
I've been working with EMGU+OpenCV for quite some time and ran into this AccessViolationException mystery.

First things first, the code:

class AVE_Simulation
    {
        public static int Width = 7500;
        public static int Height = 7500;
        public static Emgu.CV.Image<Rgb, float>[] Images;

        static void Main(string[] args)
        {
            int N = 50;
            int Threads = 5;

            Images = new Emgu.CV.Image<Rgb, float>[N];
            Console.WriteLine("Start");

            ParallelOptions po = new ParallelOptions();
            po.MaxDegreeOfParallelism = Threads;
            System.Threading.Tasks.Parallel.For(0, N, po, new Action<int>((i) =>
            {
                Images[i] = GetRandomImage();
                Console.WriteLine("Prossing image: " + i);
                Images[i].SmoothBilatral(15, 50, 50);
                GC.Collect();
            }));
            Console.WriteLine("End");
        }

        public static Emgu.CV.Image<Rgb, float> GetRandomImage()
        {
            Emgu.CV.Image<Rgb, float> im = new Emgu.CV.Image<Rgb, float>(Width, Height);

            float[, ,] d = im.Data;
            Random r = new Random((int)DateTime.Now.Ticks);

            for (int y = 0; y < Height; y++)
            {
                for (int x = 0; x < Width; x++)
                {
                    d[y, x, 0] = (float)r.Next(255);
                    d[y, x, 1] = (float)r.Next(255);
                    d[y, x, 2] = (float)r.Next(255);
                }
            }
            return im;
        }

    }

The code is simple. Allocate an array of images. Generate a random image and populate it with random numbers. Execute bilateral filter over the image. That's it.

If I run this program on a single thread (Threads = 1), everything works normally. But if I raise the number of concurrent threads to 5, I get an AccessViolationException very quickly.

I went through the OpenCV code and verified that there are no allocations on the OpenCV side. I also went through the EMGU code looking for un-pinned objects or other problems, and everything seems correct.

Some notes:

  1. If you remove the GC.Collect() you will get the AccessViolationException less often but it will eventually happen.
  2. This happens only when executed in Release mode. In Debug mode I didn't experience any exceptions.
  3. Although each image is 675MB, there is no problem with allocation (I have a lot of memory), and an OutOfMemoryException would be thrown if the system ran out of memory.
  4. I used bilateral filter but I get this exception with other filters/functions as well.
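
For reference, the 675MB figure in note 3 follows directly from the image dimensions. A minimal sketch of the arithmetic (ImageFootprint and PixelBytes are hypothetical helper names, not part of Emgu, and the calculation ignores any row padding or header overhead the library may add):

```csharp
using System;

public static class ImageFootprint
{
    // Raw pixel buffer size: width * height * channels * bytes per channel.
    // Hypothetical helper for illustration; not an Emgu API.
    public static long PixelBytes(int width, int height, int channels, int bytesPerChannel)
    {
        // Cast first so the multiplication is done in 64-bit arithmetic.
        return (long)width * height * channels * bytesPerChannel;
    }
}
```

ImageFootprint.PixelBytes(7500, 7500, 3, sizeof(float)) gives 675,000,000 bytes for one Image<Rgb, float>, and each smoothing call produces a result image of the same size.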

Any help would be appreciated. I've been trying to fix this for more than a week.

i7 (no overclock), Win7 64bit, 32GB RAM, VS 2010, Framework 4.0, OpenCV 2.4.3

Stack:

Start
Prossing image: 20
Prossing image: 30
Prossing image: 40
Prossing image: 0
Prossing image: 10
Prossing image: 21

Unhandled Exception: System.AccessViolationException: Attempted to read or write protected memory. This is often an indication that other memory is corrupt.
   at Emgu.CV.CvInvoke.cvSmooth(IntPtr src, IntPtr dst, SMOOTH_TYPE type, Int32 param1, Int32 param2, Double param3, Double param4)
   at TestMemoryViolationCrash.AVE_Simulation.<Main>b__0(Int32 i) in C:\branches\1.1\TestMemoryViolationCrash\Program.cs:line 32
   at System.Threading.Tasks.Parallel.<>c__DisplayClassf`1.<ForWorker>b__c()
   at System.Threading.Tasks.Task.InnerInvokeWithArg(Task childTask)
   at System.Threading.Tasks.Task.<>c__DisplayClass10.<ExecuteSelfReplicating>b__f(Object param0)
   at System.Threading.Tasks.Task.Execute()
   at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean preserveSyncCtx)
   at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean preserveSyncCtx)
   at System.Threading.Tasks.Task.ExecuteWithThreadLocal(Task& currentTaskSlot)
   at System.Threading.Tasks.Task.ExecuteEntry(Boolean bPreventDoubleExecution)
   at System.Threading.Tasks.ThreadPoolTaskScheduler.TryExecuteTaskInline(Task task, Boolean taskWasPreviouslyQueued)
   at System.Threading.Tasks.TaskScheduler.TryRunInline(Task task, Boolean taskWasPreviouslyQueued)
   at System.Threading.Tasks.Task.InternalRunSynchronously(TaskScheduler scheduler, Boolean waitForCompletion)
   at System.Threading.Tasks.Parallel.ForWorker[TLocal](Int32 fromInclusive, Int32 toExclusive, ParallelOptions parallelOptions, Action`1 body, Action`2 bodyWithState, Func`4 bodyWithLocal, Func`1 localInit, Action`1 localFinally)
   at System.Threading.Tasks.Parallel.For(Int32 fromInclusive, Int32 toExclusive, ParallelOptions parallelOptions, Action`1 body)
   at TestMemoryViolationCrash.AVE_Simulation.Main(String[] args) in C:\branches\1.1\TestMemoryViolationCrash\Program.cs:line 35

Unhandled Exception: System.AccessViolationException: Attempted to read or write protected memory. This is often an indication that other memory is corrupt.
   at Emgu.CV.CvInvoke.cvSmooth(IntPtr src, IntPtr dst, SMOOTH_TYPE type, Int32 param1, Int32 param2, Double param3, Double param4)
   at TestMemoryViolationCrash.AVE_Simulation.<Main>b__0(Int32 i) in C:\branches\1.1\TestMemoryViolationCrash\Program.cs:line 32
   at System.Threading.Tasks.Parallel.<>c__DisplayClassf`1.<ForWorker>b__c()
   at System.Threading.Tasks.Task.InnerInvokeWithArg(Task childTask)
   at System.Threading.Tasks.Task.<>c__DisplayClass10.<ExecuteSelfReplicating>b__f(Object param0)
   at System.Threading.Tasks.Task.Execute()
   at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean preserveSyncCtx)
   at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean preserveSyncCtx)
   at System.Threading.Tasks.Task.ExecuteWithThreadLocal(Task& currentTaskSlot)
   at System.Threading.Tasks.Task.ExecuteEntry(Boolean bPreventDoubleExecution)
   at System.Threading.ThreadPoolWorkQueue.Dispatch()

Unhandled Exception: System.AccessViolationException: Attempted to read or write protected memory. This is often an indication that other memory is corrupt.
   at Emgu.CV.CvInvoke.cvSmooth(IntPtr src, IntPtr dst, SMOOTH_TYPE type, Int32 param1, Int32 param2, Double param3, Double param4)
   at TestMemoryViolationCrash.AVE_Simulation.<Main>b__0(Int32 i) in C:\branches\1.1\TestMemoryViolationCrash\Program.cs:line 32
   at System.Threading.Tasks.Parallel.<>c__DisplayClassf`1.<ForWorker>b__c()
   at System.Threading.Tasks.Task.InnerInvokeWithArg(Task childTask)
   at System.Threading.Tasks.Task.<>c__DisplayClass10.<ExecuteSelfReplicating>b__f(Object param0)
   at System.Threading.Tasks.Task.Execute()
   at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean preserveSyncCtx)
   at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean preserveSyncCtx)
   at System.Threading.Tasks.Task.ExecuteWithThreadLocal(Task& currentTaskSlot)
   at System.Threading.Tasks.Task.ExecuteEntry(Boolean bPreventDoubleExecution)
   at System.Threading.ThreadPoolWorkQueue.Dispatch()
Press any key to continue . . .
Gilad
  • Post the stack information from when the exception happens. – Security Hound Jan 22 '13 at 16:11
  • The exception is pretty clear. You are trying to access memory that your program doesn't have access to (because it didn't fill it). – Security Hound Jan 22 '13 at 16:30
  • I don't see what is so clear. Why is there a crash in Release mode and not in Debug mode? Why does this work properly when executed in a single thread and crash with multiple threads? Please enlighten me. – Gilad Jan 22 '13 at 16:32
  • @Gilad I've usually come across this error when either a disposed image was attempted to be read or when multiple threads shared the same image pointer. Your code seems straight though. – Asti Jan 22 '13 at 16:42
  • I don't know the Emgu library, but the documentation tells me that Image implements `IDisposable`. I suggest you try wrapping the handling of Images in a using block. – Brian Rasmussen Jan 22 '13 at 16:53
  • @Gilad - It doesn't make sense that it only crashes in Release mode. So the only logical explanation is that your code is the problem and you're simply lucky it's working otherwise. – Security Hound Jan 22 '13 at 17:31
  • I don't have a complete answer to all your questions, but I'm guessing this is because when you use multiple threads they all try to use a single memory location. Also, I have a feeling Debug mode runs differently from Release mode (in a manner that doesn't lead to the multithreading problem). – Chibueze Opata Jan 24 '13 at 20:48
  • AccessViolationException in a piece of code named XXXX is (99% chance) a bug in XXXX. You should contact the support for Emgu.CV. It's probably because you do something it doesn't like, maybe you can find a workaround, but it's still a bug in Emgu.CV. – Simon Mourier Jan 27 '13 at 07:50
  • @Simon, I went over the EMGU code myself and everything there looks fine, all the objects are pinned correctly. I doubt that it is an issue in EMGU. – Gilad Jan 27 '13 at 08:18
  • Well, I should have said it's a bug in Emgu.CV or lower layers. Since Emgu.CV is more a gateway to OpenCV, I suppose it may be a bug in OpenCV. AccessViolationException is the typical .NET error equivalent when a GPF (http://en.wikipedia.org/wiki/General_protection_fault) happens in native code under windows. This would not be unusual in multithreading environment (for example http://code.opencv.org/issues/2059). – Simon Mourier Jan 27 '13 at 10:37
  • @Simon, I don't know... I went over all the relevant code including OpenCV. Couldn't find any initialization of new memory or other memory issues. I saw the case you have found and I don't think this is a similar case... – Gilad Jan 27 '13 at 11:08
  • I know the case is not related, it's just an example that shows OpenCV is not bug free, especially in multithreading environments (and even the latest versions). – Simon Mourier Jan 27 '13 at 11:13

2 Answers


Your example doesn't keep a reference to the result image from Image.SmoothBilatral. The input images are rooted in a static array so are fine.

An Emgu.CV Image's data array is pinned via a GCHandle held inside the image itself. That is no different from the image simply containing the array: in the absence of a managed root to the image, it doesn't prevent collection while the GCHandle's pointer is in use by unmanaged code.

Because the Image.SmoothBilatral method doesn't do anything with its temporary result image other than pass its pointer and return it, I think it gets optimised away to the extent that the result image can be collected while the smooth is processing.
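
The liveness rule behind this can be illustrated without Emgu. The sketch below is only a stand-in under stated assumptions: ImageLike, FakeNativeCall and both public method names are hypothetical, and the forced GC.Collect() plays the role of a collection that happens while native code is still using the pointer.

```csharp
using System;
using System.Runtime.CompilerServices;

// Hypothetical stand-in for an image wrapper that hands a raw pointer to native code.
public sealed class ImageLike
{
    public readonly IntPtr Ptr = new IntPtr(1); // placeholder value, no real native memory
}

public static class PrematureCollectionDemo
{
    // Stand-in for a P/Invoke call; forces a collection while "native code" runs.
    [MethodImpl(MethodImplOptions.NoInlining)]
    private static void FakeNativeCall(IntPtr ptr)
    {
        GC.Collect();
    }

    // 'owner' is not used after its Ptr is read, so an optimizing JIT may
    // treat it as dead for the whole duration of the call.
    public static bool CollectedWithoutKeepAlive()
    {
        var owner = new ImageLike();
        var weak = new WeakReference(owner); // observes the object without rooting it
        FakeNativeCall(owner.Ptr);
        return !weak.IsAlive;
    }

    // GC.KeepAlive extends the lifetime of 'owner' past the call.
    public static bool CollectedWithKeepAlive()
    {
        var owner = new ImageLike();
        var weak = new WeakReference(owner);
        FakeNativeCall(owner.Ptr);
        bool collected = !weak.IsAlive;
        GC.KeepAlive(owner); // 'owner' is reachable at least until this point
        return collected;
    }
}
```

In an optimized build run without a debugger, CollectedWithoutKeepAlive may return true; in a Debug build locals are usually kept alive to the end of the method, which would be consistent with the crash only appearing in Release mode. CollectedWithKeepAlive returns false in either configuration.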

Because there's no finalizer for this class, OpenCV will not get called upon to release its unmanaged image header (which has a pointer to the managed image data), so OpenCV still thinks it has a usable image structure.

You can fix it by taking a reference to the result of SmoothBilatral and doing something with it (like disposing it).

This extension method would also work (i.e. allow it to be called successfully for benchmarking without the result being used):

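// Requires: using System.Runtime.InteropServices;
// testchannels and testtype are not defined in this snippet; they are assumed
// to be aliases for the image's color and depth types (Rgb and float in the question).
public static class BilateralExtensionFix
{
    public static Emgu.CV.Image<testchannels, testtype> SmoothBilateral(this Emgu.CV.Image<testchannels, testtype> image, int p1, int p2, int p3)
    {
        var result = image.CopyBlank();
        // A normal GCHandle roots the result image for the duration of the native call.
        var handle = GCHandle.Alloc(result);
        try
        {
            Emgu.CV.CvInvoke.cvSmooth(image.Ptr, result.Ptr, Emgu.CV.CvEnum.SMOOTH_TYPE.CV_BILATERAL, p1, p1, p2, p3);
        }
        finally
        {
            handle.Free(); // release the root even if the native call throws
        }
        return result;
    }
}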
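
The GCHandle.Alloc(result) call in the extension method relies on a normal (non-pinned) handle acting as a GC root. A minimal sketch of that behaviour, independent of Emgu (GcHandleRootDemo and its method names are hypothetical, and the byte array merely stands in for the result image):

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;

public static class GcHandleRootDemo
{
    // Allocates an object that is reachable only through the returned GCHandle;
    // the WeakReference observes it without keeping it alive.
    [MethodImpl(MethodImplOptions.NoInlining)]
    private static GCHandle AllocateRooted(out WeakReference weak)
    {
        var payload = new byte[1024];
        weak = new WeakReference(payload);
        return GCHandle.Alloc(payload); // GCHandleType.Normal by default
    }

    // Returns true if the object survived a forced collection while the handle was held.
    public static bool SurvivesWhileHandleHeld()
    {
        WeakReference weak;
        GCHandle handle = AllocateRooted(out weak);
        try
        {
            GC.Collect();
            return weak.IsAlive;
        }
        finally
        {
            handle.Free(); // after this the object is eligible for collection again
        }
    }
}
```

GcHandleRootDemo.SurvivesWhileHandleHeld() reports that the object is still alive after the forced collection, since the handle is only freed afterwards.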

I think what EmguCV should be doing is only pinning pointers to pass to opencv while making an interop call.

P.S. The OpenCV bilateral filter crashes (producing a very similar error to your problem) on any kind of float image passed with zero variation (min() = max()) across all channels. I think this is because of how it builds its binned exp() lookup table.

This can be reproduced with:

    // create new blank image
    var zeroesF1 = new Emgu.CV.Image<Rgb, float>(75, 75);
    // uncomment next line for failure
    zeroesF1.Data[0, 0, 0] += 1.2037063600E-035f;
    zeroesF1.SmoothBilatral(15, 50, 50);

This was confusing me as I was actually sometimes getting this error due to a bug in my test code...

Peter Wishart
  • Thanks Gilad. An interesting question... taught me a lot that I thought I knew already. – Peter Wishart Jan 31 '13 at 10:04
  • @PeterWishart Can you please take a look at this https://stackoverflow.com/questions/50388992/emgucv-out-of-memory-expcetion-in-x86-release-mode-only-sharpening-images – techno May 18 '18 at 05:08

What version of Emgu CV are you using? I couldn't find a 2.4.3 version of it.

Pretty sure your code is not the problem.

It seems possible that the Emgu.CV.Image constructor has a concurrency issue (either in the managed wrapper or in the unmanaged code). The way the managed data array is handled in the Emgu CV trunk seems solid, but some unmanaged data is allocated during the image constructor, which I suppose might have gone wrong.

What happens if you try:

  • Moving Images[i] = GetRandomImage(); outside of the parallel For().
  • Slapping a lock() around the Image constructor in GetRandomImage()

I noticed there's a closed bug report from someone having a similar issue (calls to the image constructor occurring in parallel, but the images themselves not shared among threads) here.

[Edit]

Yes this is a strange one. I can reproduce with the stock 2.4.2 version and OpenCV binaries.

It only seems to crash for me if the number of threads in the parallel for exceeds the number of cores (which is 2 for me), so it would be interesting to know how many cores are on your test system.

Also I only get the crash when the code is not attached to the debugger and Optimize Code is enabled - have you ever observed it in release mode with the debugger attached?

As the SmoothBilateral function is CPU bound, using a MaxDegreeOfParallelism greater than the number of cores doesn't really add any benefit, so there's a perfect workaround, assuming what I found about the number of threads vs. cores is also true for your rig (Sod's law predicts: it isn't).
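
A minimal sketch of that workaround, assuming it applies at all (CappedParallel and its members are hypothetical names; Environment.ProcessorCount reports logical processors, so it counts hyper-threaded cores twice):

```csharp
using System;
using System.Threading.Tasks;

public static class CappedParallel
{
    // Clamp a requested degree of parallelism to the range [1, logical core count].
    public static int EffectiveThreads(int requested)
    {
        return Math.Max(1, Math.Min(requested, Environment.ProcessorCount));
    }

    // Build ParallelOptions that never ask for more workers than there are cores.
    public static ParallelOptions CreateOptions(int requestedThreads)
    {
        return new ParallelOptions
        {
            MaxDegreeOfParallelism = EffectiveThreads(requestedThreads)
        };
    }
}
```

The options would then be passed as before, for example Parallel.For(0, N, CappedParallel.CreateOptions(Threads), body). This only limits how many iterations run at once; it does not address why the crash occurs.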

So my guess is there is a concurrency/volatile issue in Emgu that only manifests when JIT optimisation is run, and when the GC is moving managed data around. But, as you say, there are no obvious unpinned-pointer-to-managed-object issues in the Emgu code.

Although I still can't explain it properly, here's what I found so far:

With the GC.Collect + console logs removed, the calls to GetRandomImage() serialised, and the code run outside of MSVC I couldn't reproduce the issue (although this may have just reduced the frequency):

            public static int Width = 750;
            public static int Height = 750;
...
                int N = 500;
                int Threads = 11;
                Images = new Emgu.CV.Image<Rgb, float>[N];
                Console.WriteLine("Start");
                ParallelOptions po = new ParallelOptions();
                po.MaxDegreeOfParallelism = Threads;
                for (int i = 0; i < N; i++)
                {
                    Images[i] = GetRandomImage();
                }
                System.Threading.Tasks.Parallel.For(0, N, po, new Action<int>((i) =>
                {
                    //Console.WriteLine("CallingSmooth");
                    Images[i].SmoothBilatral(15, 50, 50);
                    //Console.WriteLine("SmoothCompleted");
                }));
                Console.WriteLine("End");

I added a timer to fire GC.Collect outside of the parallel for, but still more often than it would fire normally:

        var t = new System.Threading.Timer((dummy) => { 
            GC.Collect(); 
        }, null, 100,100);

And with this change I still can't reproduce the issue, although GC.Collect is being called less consistently than in your demo as the thread pool is busy; also, there are no (or very few) managed allocations occurring in the main loop for it to collect. Uncommenting the console logs around the SmoothBilatral call then repros the error fairly swiftly (by giving the GC something to collect, I guess).

[Another edit]

The OpenCV 2.4.2 reference manual states that cvSmooth is deprecated AND that "Median and bilateral filters work with 1- or 3-channel 8-bit images and can not process images in-place."... not very encouraging!

I find that using the median filter on byte or float images and the bilateral filter on byte images works fine, and I can't see why any CLR/GC issues wouldn't affect those cases too.

So despite the strange effects on the C# test program I still reckon this is an Emgu/OpenCV bug.

If you haven't already, you should test with OpenCV binaries that you've compiled yourself; if it still fails, convert your test to C++.

N.b. that OpenCV has its own parallelism implementation which would probably work out faster.

Peter Wishart
  • I used EMGU 2.4.2 and tweaked it to work with OpenCV 2.4.3 binaries. There were no breaking changes, just a couple of additional methods (and bug fixes). I traced EMGU code multiple times and it seems that it is written correctly (All the relevant objects are pinned). This is why this 'bug' is so strange. – Gilad Jan 27 '13 at 08:22
  • I tried what you have suggested. I took GetRandomImage out of the parallel loop but still AccessViolationException. There was no point adding the lock as moving the ctor outside of the parallel loop is the same as executing it synchronously. – Gilad Jan 27 '13 at 08:32
  • I also checked the 'closed bug'. It seems that his solution was to move the processing code into the enumerator function. I'm more than positive that when he wrote that he had found a solution, he didn't notice that his code wasn't running in parallel anymore (I guess it was the adrenalin rush of finding a solution)... – Gilad Jan 27 '13 at 09:29
  • Thanks for putting some time into this. My CPU is 4 cores with HT so it can run 8 images at once. The problem is obvious, even though some objects are pinned they are being moved by the GC (When GC.Collect is executed). I'm getting a very strong feeling that this is a bug in the MS framework. I also checked Framework 4.5 (VS2012) and got the same error. – Gilad Jan 29 '13 at 11:24
  • Another possibility is some kind of heap corruption from the unmanaged OpenCV code that is only detected when the GC works (actually maybe not, think that would return a different error). – Peter Wishart Jan 29 '13 at 12:14
  • Do you ever get a crash with MaxDOP of 4 instead of 5 (you've still only got 4 sets of L1/L2 cache)? I wonder if changing the method of parallelism (i.e. fire up X backgroundworkers, wait for all to complete, fire up X more) or reimplementing your demo in C++ would make a difference (the latter would rule out GC moves). – Peter Wishart Jan 29 '13 at 12:21
  • Yes, I do get a crash. The true problem is that GC.Collect is executed while a different thread is executing a function via P/Invoke. So as you can imagine, this can happen with 2 threads. – Gilad Jan 29 '13 at 13:02
  • Bilateral filter was taken out of the smooth function and given a function of its own, see p.225. It can receive float image and this is the method used by EMGU. – Gilad Jan 30 '13 at 08:56
  • On the latter point, that's not what I read in the 2.4.2 EMGU source: http://sourceforge.net/p/emgucv/code/ci/0e7f54c4e2adaa0fadfe72b0f4b378adbeff3002/tree/Emgu.CV/Image.cs#l3886 http://sourceforge.net/p/emgucv/code/ci/0e7f54c4e2adaa0fadfe72b0f4b378adbeff3002/tree/Emgu.CV/PInvoke/CvInvokeImgproc.cs#l359 – Peter Wishart Jan 30 '13 at 09:56