I am a beginner in programming, and for my first "harder" project I chose to make a tool that searches for duplicates in my picture collection.

My first thought was to use hashes so I came up with this:

    var files = Directory.GetFiles("T:Obrazki", "*.jpg");
    foreach (var item in files)
    {
        var m = Image.FromFile(item);
        Console.WriteLine(m.GetHashCode());
    }

It starts out fine and then throws a System.OutOfMemoryException.

I tried many things, including splitting the loop into two for loops, but with no effect. Next I found advice online to change the Target Platform to x64. I did that, and it did not help either.

The last thing I tried was to dispose of 'm' in every iteration of the loop and to call GC.Collect manually:

    var files = Directory.GetFiles("T:Obrazki", "*.jpg");
    foreach (var item in files)
    {
        var m = Image.FromFile(item);
        Console.WriteLine(m.GetHashCode());
        m.Dispose();
        GC.Collect();
    }

That didn't work either. It crashes after roughly 180 images. Do you have any ideas how to do this?

  • I don't know if this is the best way, but you could read the first x number of bytes from each file (don't treat it as an image), and compare them, only when you find matches do you increase how many bytes you grab. How many bytes should you grab to start with? I'm not sure, tweak it so it's accurate but try to keep it small – Steve's a D Aug 22 '15 at 00:14
  • Have a look at this question. I think the problem might be it. Use Directory.EnumerateFiles http://stackoverflow.com/questions/1970603/c-sharp-directory-getfiles-memory-help – SergeyAn Aug 22 '15 at 02:30
  • @SteveG It would be really slow. – Bartosz Żółkiewski Aug 22 '15 at 11:43
  • @user1551066 The problem clearly is in creating a new Bitmap in `var m = Image.FromFile(item);`. The exception is being thrown from there. – Bartosz Żółkiewski Aug 22 '15 at 11:45
  • @BartoszŻółkiewski no slower than it is now, you'll be reading less from disk and won't have to force an all generational gc collection, and even though you'll have to seek twice for a few files, your other option leads you to an OOM exception. Basically, no matter my solution, or somebody else's, you're not going to be successful loading all of the images at once. – Steve's a D Aug 22 '15 at 15:18
  • @BartoszŻółkiewski also, ".NET GC is non-deterministic (i.e. you never know nor should you depend on when it happens, which means you can never be sure when the runtime will collect old objects)" [from here](http://stackoverflow.com/a/9432317/496680), so I think you're depending on the gc too much, which is also why you're going OOM, even though you're collecting – Steve's a D Aug 22 '15 at 15:32
  • @BartoszŻółkiewski also, this might help: http://stackoverflow.com/a/653769/496680 – Steve's a D Aug 22 '15 at 15:57
  • possible duplicate of [Is there a reason Image.FromFile throws an OutOfMemoryException for an invalid image format?](http://stackoverflow.com/questions/2610416/is-there-a-reason-image-fromfile-throws-an-outofmemoryexception-for-an-invalid-i) – HugoRune Aug 24 '15 at 00:16
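
A minimal sketch of the approach suggested in the comments above: treat each file as raw bytes instead of decoding it as an image, and enumerate the folder lazily with Directory.EnumerateFiles. As far as I know, Image does not override GetHashCode, so the value printed in the question does not depend on the pixel data and cannot identify duplicates. The sketch swaps in a plain technique rather than the incremental prefix comparison described in the first comment: files are bucketed by length, and only files that share a length are hashed with SHA-256. The names DuplicateFinder, HashFile and FindDuplicates are made up for illustration, and the folder and pattern are parameters supplied by the caller.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Security.Cryptography;

// Hypothetical helper, not part of any library.
static class DuplicateFinder
{
    // Returns the SHA-256 of the file's raw bytes as a hex string.
    // The file is streamed, so it is never decoded into a bitmap.
    public static string HashFile(string path)
    {
        using (var sha = SHA256.Create())
        using (var stream = File.OpenRead(path))
        {
            return BitConverter.ToString(sha.ComputeHash(stream)).Replace("-", "");
        }
    }

    // Groups files whose content is byte-for-byte identical.
    // Files are bucketed by length first, so a file with a unique
    // length is never opened at all.
    public static List<List<string>> FindDuplicates(string folder, string pattern)
    {
        return Directory.EnumerateFiles(folder, pattern)
            .GroupBy(path => new FileInfo(path).Length)
            .Where(sameSize => sameSize.Count() > 1)
            .SelectMany(sameSize => sameSize.GroupBy(path => HashFile(path)))
            .Where(sameHash => sameHash.Count() > 1)
            .Select(sameHash => sameHash.ToList())
            .ToList();
    }
}
```

Calling DuplicateFinder.FindDuplicates(folder, "*.jpg") returns one list of paths per set of identical files. This only finds exact copies; the same picture saved at a different size or quality has different bytes and will not match.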

1 Answer

Try creating a System.Drawing.Bitmap object for each file in your folder, then compare the objects two at a time using the GetPixel(int x, int y) method, where x and y are the coordinates of the pixel to retrieve. This will help you find duplicates. This short article may be what you are looking for:

Image Comparison using C#

If you are working with many image files, consider releasing each bitmap's resources after use.
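
A rough sketch of what this could look like, assuming only two images are held in memory at a time rather than a whole collection. The names PixelCompare, SamePixels and SameImage are hypothetical. The using blocks release each image as soon as the comparison finishes, and the catch covers the case raised in the comments: Image.FromFile is documented to throw OutOfMemoryException when a file does not have a valid image format, so a single bad file in the folder can look like a memory problem.

```csharp
using System;
using System.Drawing;

// Hypothetical helper, not part of System.Drawing.
static class PixelCompare
{
    // True when both bitmaps have the same size and identical pixels.
    public static bool SamePixels(Bitmap a, Bitmap b)
    {
        if (a.Size != b.Size) return false;

        for (int y = 0; y < a.Height; y++)
        {
            for (int x = 0; x < a.Width; x++)
            {
                if (a.GetPixel(x, y).ToArgb() != b.GetPixel(x, y).ToArgb())
                    return false;
            }
        }
        return true;
    }

    // Loads two files, compares them, and disposes everything before returning.
    // Returns null when GDI+ cannot decode one of the files.
    public static bool? SameImage(string pathA, string pathB)
    {
        try
        {
            using (var imageA = Image.FromFile(pathA))
            using (var imageB = Image.FromFile(pathB))
            using (var a = new Bitmap(imageA))
            using (var b = new Bitmap(imageB))
            {
                return SamePixels(a, b);
            }
        }
        catch (OutOfMemoryException)
        {
            // Thrown by Image.FromFile for an invalid or unsupported image file.
            return null;
        }
    }
}
```

GetPixel is slow on large images, and comparing every pair of files grows quadratically with the size of the collection, so this is probably best reserved for confirming candidates that a cheaper check has already flagged.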

Quan Nguyen
  • Unfortunately it still throws OutOfMemory while creating the collection. I think it has something to do with the fact that I am loading that many images. I tried resizing them to 16x16, but it doesn't make any difference. It crashes after the same number of loads. – Bartosz Żółkiewski Aug 22 '15 at 11:41