CV - Extract differences between two images

Question

I am currently working on an intrusion detection system based on video surveillance. To do this, I take a snapshot of the background of my scene (assume it's totally clean, no people or moving objects). Then I compare each frame I get from the (static) video camera against it and look for the differences. I have to be able to detect any difference, not only human shapes or the like, so I cannot rely on specific feature extraction.

Typically, I have:

I am using OpenCV, so to compare I basically do:

cv::Mat bg_frame;
cv::Mat cam_frame;
cv::Mat motion;

cv::absdiff(bg_frame, cam_frame, motion);
cv::threshold(motion, motion, 80, 255, cv::THRESH_BINARY);
cv::erode(motion, motion, cv::getStructuringElement(cv::MORPH_RECT, cv::Size(3,3)));

Here is the result:

As you can see, part of the arm is missing (I guess because its color is too close to the background's), and this is sadly not what I want.

I thought about adding cv::Canny() to detect the edges and fill in the missing part of the arm, but sadly (once again) it only solves the problem in a few situations, not most of them.

Is there any algorithm or technique I could use to obtain an accurate difference report?

PS: Sorry about the images. Since I only just signed up, I do not have enough reputation.

EDIT: I use grayscale images here, but I am open to any solution.

maybe try [not to reinvent the wheel](http://docs.opencv.org/modules/video/doc/motion_analysis_and_object_tracking.html#backgroundsubtractor) — berak, Nov 20 '14 at 09:15
just have a look, what's already builtin (and if it works better), yes. — berak, Nov 20 '14 at 09:19
did you convert to grayscale? If you don't want to use the OpenCV classes: try computing the difference of each channel and combining them, or try the difference of HSV images. If you want to use existing techniques, try ViBe. Keep in mind that even lighting changes count as "any differences, not only human shape or whatever", which might be a problem for most background subtraction methods. — Micka, Nov 20 '14 at 09:22
in general, building a background model over time, using many images, will beat any one-off approach. also, [Vibe](http://www2.ulg.ac.be/telecom/publi/publications/barnich/Barnich2011ViBe/index.html#toc-Section--2) <-- just be aware that it is patented. — berak, Nov 20 '14 at 09:31
well, the question here was not how to build the background model, but how to find the "differences" better (which is hard in grayscale images). @ValentinTrinqué can you please add the original single images? I've implemented a multi-channel version of your code, but I don't have access to the original images. I tried to crop your double-image, but the two halves look a bit translated... — Micka, Nov 20 '14 at 09:39

Micka · Accepted Answer · 2014-11-20T10:16:47.210

One problem in your code is that cv::threshold is applied to a 1-channel (grayscale) difference image. Finding the pixelwise "difference" between two images in grayscale only often leads to unintuitive results.
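To illustrate why a grayscale-only difference can miss an obvious change, here is a minimal NumPy sketch. The two pixel values are hypothetical, picked so that their luminance is nearly equal; the weights are the usual BGR-to-gray coefficients.

```python
import numpy as np

# Two hypothetical BGR pixels chosen to have almost the same luminance.
background = np.array([0, 0, 200], dtype=np.float32)   # mostly red
foreground = np.array([0, 102, 0], dtype=np.float32)   # mostly green

# Usual BGR -> gray weights: 0.114*B + 0.587*G + 0.299*R.
weights = np.array([0.114, 0.587, 0.299], dtype=np.float32)

# Difference after converting each pixel to gray.
gray_diff = abs(float(background @ weights) - float(foreground @ weights))

# Euclidean distance between the two pixels in BGR space.
color_dist = float(np.linalg.norm(background - foreground))

print(f"grayscale difference: {gray_diff:.2f}")
print(f"color distance:       {color_dist:.2f}")
```

The grayscale difference stays far below a threshold of 80, while the color distance is large, so only a multi-channel comparison would flag this pixel.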

Since the images you provided are slightly translated (or the camera wasn't stationary), I've manipulated your background image to add some foreground:

background image:

foreground image:

code:

    // Per-channel absolute difference (both inputs are 3-channel BGR images).
    cv::Mat diffImage;
    cv::absdiff(backgroundImage, currentImage, diffImage);

    cv::Mat foregroundMask = cv::Mat::zeros(diffImage.rows, diffImage.cols, CV_8UC1);

    const float threshold = 30.0f;

    for(int j=0; j<diffImage.rows; ++j)
        for(int i=0; i<diffImage.cols; ++i)
        {
            cv::Vec3b pix = diffImage.at<cv::Vec3b>(j,i);

            // Euclidean length of the BGR difference vector.
            float dist = std::sqrt(static_cast<float>(pix[0]*pix[0] + pix[1]*pix[1] + pix[2]*pix[2]));

            // Mark the pixel as foreground if the color changed enough.
            if(dist>threshold)
            {
                foregroundMask.at<unsigned char>(j,i) = 255;
            }
        }

giving this result:

with this difference image:

In general, it is hard to compute a complete foreground/background segmentation from pixel-wise differences alone.
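For reference, the per-pixel color-distance test from the C++ code can also be written without an explicit loop. This is only a sketch in Python/NumPy; the function name foreground_mask and the tiny synthetic images are made up for illustration.

```python
import numpy as np

def foreground_mask(background, current, threshold=30.0):
    """Return a uint8 mask (0 or 255) of pixels whose BGR distance exceeds threshold."""
    # Work in float32 so the subtraction cannot wrap and the squares cannot overflow.
    diff = background.astype(np.float32) - current.astype(np.float32)
    dist = np.sqrt(np.sum(diff * diff, axis=2))
    return np.where(dist > threshold, 255, 0).astype(np.uint8)

# Tiny synthetic example: a uniform background with one changed pixel.
bg = np.full((4, 4, 3), 100, dtype=np.uint8)
cur = bg.copy()
cur[1, 2] = (100, 160, 100)   # differs by 60 in one channel
mask = foreground_mask(bg, cur)
print(mask)
```

Only the pixel at row 1, column 2 ends up set in the mask, since it is the only one whose color distance exceeds the threshold.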

You will probably have to add post-processing to get a real segmentation, starting from your foreground mask. I am not sure whether there are any stable universal solutions yet.
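One common kind of post-processing is morphological cleanup of the mask. The sketch below is not part of the original answer: it implements a 3x3 closing followed by an opening in plain NumPy so that it is self-contained, and all function names are hypothetical. With OpenCV, cv2.morphologyEx with cv2.MORPH_CLOSE and cv2.MORPH_OPEN would be the usual route.

```python
import numpy as np

def _shifted_stack(mask, pad_value):
    """Stack the 9 shifted copies of a boolean mask that form each 3x3 neighborhood."""
    padded = np.pad(mask, 1, constant_values=pad_value)
    h, w = mask.shape
    return np.stack([padded[dy:dy + h, dx:dx + w]
                     for dy in range(3) for dx in range(3)])

def dilate(mask):
    # A pixel becomes foreground if any 3x3 neighbor is foreground.
    return _shifted_stack(mask, False).any(axis=0)

def erode(mask):
    # A pixel stays foreground only if all 3x3 neighbors are foreground.
    return _shifted_stack(mask, True).all(axis=0)

def clean_mask(mask):
    """Closing (fill small holes) followed by opening (drop isolated specks)."""
    closed = erode(dilate(mask))
    return dilate(erode(closed))

# Synthetic mask: a 5x5 blob with a one-pixel hole, plus an isolated noise pixel.
mask = np.zeros((16, 16), dtype=bool)
mask[3:8, 3:8] = True
mask[5, 5] = False      # hole inside the blob
mask[12, 12] = True     # speck of noise
cleaned = clean_mask(mask)
print(cleaned.astype(np.uint8))
```

The order matters here: opening first would erase the blob entirely, because the hole sits inside every 3x3 window that fits in it.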

As berak mentioned, in practice a single background image won't be enough, so you will have to compute and maintain your background image over time. There are plenty of papers covering this topic and, as far as I know, no stable universal solution yet.
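The simplest way to maintain a background over time is an exponential moving average of the incoming frames. This is only a sketch of that one idea, not of the more robust models from the literature; the function name, the value of alpha, and the synthetic frames are assumptions. In OpenCV, cv2.accumulateWeighted performs the same update.

```python
import numpy as np

def update_background(background, frame, alpha=0.05):
    """Blend the new frame into the float32 background model (exponential moving average)."""
    return (1.0 - alpha) * background + alpha * frame.astype(np.float32)

# Synthetic grayscale stream: the scene slowly brightens from 100 to 120.
frames = [np.full((4, 4), 100 + step, dtype=np.uint8) for step in range(21)]

background = frames[0].astype(np.float32)
for frame in frames[1:]:
    background = update_background(background, frame)

# The model lags behind the last frame but has followed the drift.
print(background[0, 0], frames[-1][0, 0])
```

A small alpha adapts slowly and tolerates brief intruders; a large alpha follows lighting changes quickly but also absorbs a person who stands still.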

Here are some more tests. I converted both images to HSV color space:

    cv::cvtColor(backgroundImage, HSVbackgroundImagebg, CV_BGR2HSV);
    cv::cvtColor(currentImage, HSV_currentImage, CV_BGR2HSV);

and performed the same operations in this space, leading to this result:

after adding some noise to the input:

I get this result:

So maybe the threshold is a bit too high. I still encourage you to look at HSV color space too, but you might have to reinterpret the "difference image" and rescale each channel before combining their difference values.
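One possible way to rescale and combine the HSV channels is sketched below. It assumes OpenCV's 8-bit HSV convention (H in [0, 179], S and V in [0, 255]) and treats hue as a circular quantity; the function name and the equal default weights are assumptions, not something taken from the tests above.

```python
import numpy as np

def hsv_distance(hsv_a, hsv_b, weights=(1.0, 1.0, 1.0)):
    """Per-pixel distance between two 8-bit HSV images, each channel scaled to [0, 1].

    Assumes OpenCV's 8-bit convention: H in [0, 179], S and V in [0, 255].
    """
    a = hsv_a.astype(np.float32)
    b = hsv_b.astype(np.float32)

    # Hue is an angle, so 178 and 2 are only 4 steps apart, not 176.
    dh = np.abs(a[..., 0] - b[..., 0])
    dh = np.minimum(dh, 180.0 - dh) / 90.0      # largest possible hue gap is 90
    ds = np.abs(a[..., 1] - b[..., 1]) / 255.0
    dv = np.abs(a[..., 2] - b[..., 2]) / 255.0

    wh, ws, wv = weights
    return np.sqrt(wh * dh**2 + ws * ds**2 + wv * dv**2)

# Two single-pixel "images": hues on either side of the wrap-around point.
pixel_a = np.array([[[178, 200, 200]]], dtype=np.uint8)
pixel_b = np.array([[[2, 200, 200]]], dtype=np.uint8)
print(hsv_distance(pixel_a, pixel_b))
```

Lowering the weight on the V channel is one way to make the comparison less sensitive to brightness changes, at the cost of missing objects that differ from the background mainly in brightness.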

@Gizmo sorry, I'm currently not working in that field. Nice background modelling methods are ViBe, mixture of gaussians or simple frame differences ((currentFrame - lastFrame) & (nextFrame - currentFrame)), but maybe there are newer and better state of the art methods in literature. — Micka, Jul 12 '16 at 20:45
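The "simple frame differences" idea from the comment above, (currentFrame - lastFrame) & (nextFrame - currentFrame), might look like this in NumPy. This is a sketch under assumptions: the function name, the threshold, and the synthetic frames are all made up for illustration.

```python
import numpy as np

def three_frame_motion(prev_frame, cur_frame, next_frame, threshold=25):
    """Mask of pixels that changed both before and after the current frame."""
    # int16 avoids uint8 wrap-around when subtracting.
    d1 = np.abs(cur_frame.astype(np.int16) - prev_frame.astype(np.int16)) > threshold
    d2 = np.abs(next_frame.astype(np.int16) - cur_frame.astype(np.int16)) > threshold
    return d1 & d2

# Synthetic grayscale frames: a bright 2x2 object moves two pixels right per frame.
def frame_with_object(col):
    frame = np.full((6, 10), 50, dtype=np.uint8)
    frame[2:4, col:col + 2] = 200
    return frame

prev_f, cur_f, next_f = (frame_with_object(c) for c in (2, 4, 6))
motion = three_frame_motion(prev_f, cur_f, next_f)
print(motion.astype(np.uint8))
```

Taking the AND of the two differences removes the "ghost" left at the object's previous and next positions, so the mask covers only where the object is in the current frame. It cannot detect an object that has stopped moving.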

score 31 · Answer 2 · edited May 24 '23 at 23:17

I use Python, this is my result:

The code:

# 2017.12.22 15:48:03 CST
# 2017.12.22 16:00:14 CST
import cv2
import numpy as np

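img1 = cv2.imread("img1.png")
img2 = cv2.imread("img2.png")

# Per-channel absolute difference, collapsed to one grayscale channel
diff = cv2.absdiff(img1, img2)
mask = cv2.cvtColor(diff, cv2.COLOR_BGR2GRAY)

# Boolean mask of pixels that differ by more than th
th = 1
imask = mask > th

# Copy only the changed pixels of img2 onto a black canvas
canvas = np.zeros_like(img2, np.uint8)
canvas[imask] = img2[imask]

cv2.imwrite("result.png", canvas)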

Update, here is C++ code:

//! 2017.12.22 17:05:18 CST
//! 2017.12.22 17:22:32 CST

#include <opencv2/opencv.hpp>
#include <iostream>
using namespace std;
using namespace cv;
int main() {

    Mat img1 = imread("img3_1.png");
    Mat img2 = imread("img3_2.png");

    // calc the difference
    Mat diff;
    absdiff(img1, img2, diff);

    // Get the mask if difference greater than th
    int th = 10;  // 0
    Mat mask = Mat::zeros(img1.size(), CV_8UC1);  // start from an all-black mask
    for(int j=0; j<diff.rows; ++j) {
        for(int i=0; i<diff.cols; ++i){
            cv::Vec3b pix = diff.at<cv::Vec3b>(j,i);
            int val = (pix[0] + pix[1] + pix[2]);
            if(val>th){
                mask.at<unsigned char>(j,i) = 255;
            }
        }
    }

    // get the foreground
    Mat res;
    bitwise_and(img2, img2, res, mask);

    // display
    imshow("res", res);
    waitKey();
    return 0;
}

The Python code above is a neat solution. If it is a security application, then (it's only my opinion) I'm guessing you won't be able to take a single snapshot and use that as the reference. Lighting conditions and things moving (even a fraction) could wipe out your image. As a suggestion, wouldn't it be better to continually take snapshots from your video feed at a timed interval (say 1 second) and then compare the last three or four, looking for differences? So if three frames are identical, then the fourth must have something extra in it. — perfo, Jun 13 '18 at 16:15
As we all know, Python is a good choice for fast prototyping. Though the details differ between languages such as C/C++/Python, the algorithm (or main idea) is the same. Of course this is toy code, written just for this question and specific to these two images. But it can be extended to video with more trial and error, e.g. `threshold the sliding-subtractions with morph-ops`. Note that no two images are identical, because there is always white noise. — Kinght 金, Jun 13 '18 at 19:40

score 6 · Answer 3 · edited May 24 '23 at 23:17

Another technique to obtain the exact pixel differences between two images is to use the Structural Similarity Index (SSIM), first introduced in the paper Image Quality Assessment: From Error Visibility to Structural Similarity. This method can be used to determine whether two images are identical and/or to showcase differences due to tiny image discrepancies. SSIM is already implemented in the scikit-image library for image processing as skimage.metrics.structural_similarity().

The structural_similarity() function returns a score and, with full=True, a difference image, diff. The score is the mean structural similarity index between the two input images and falls in the range [-1, 1], with values closer to one representing higher similarity. But since you're only interested in where the two images differ, the diff image is what we'll focus on. Specifically, the diff image contains the actual image differences, with darker regions having more disparity. Larger areas of disparity are highlighted in black, while smaller differences are in gray.
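To make the score more concrete, here is a simplified sketch that evaluates the SSIM formula once over the whole image. This is not what scikit-image does: structural_similarity() computes the index in a sliding window and averages it, so the numbers will differ. The function name is hypothetical, and the stabilizing constants use the conventional K1 = 0.01 and K2 = 0.03.

```python
import numpy as np

def global_ssim(x, y, data_range=255.0):
    """SSIM computed once over the whole image (no sliding window)."""
    x = x.astype(np.float64)
    y = y.astype(np.float64)
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2

    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov_xy = ((x - mu_x) * (y - mu_y)).mean()

    return ((2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)) / (
        (mu_x**2 + mu_y**2 + c1) * (var_x + var_y + c2))

rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(32, 32)).astype(np.uint8)
inverted = 255 - image

print(global_ssim(image, image))     # identical images
print(global_ssim(image, inverted))  # negatively correlated images
```

Identical inputs give a score of one, and an inverted copy gives a negative score, which is why the range extends below zero.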

Using these two input images

We get this result

The SSIM score after comparing the two images shows that they are very similar:

Image Similarity: 95.8648%

from skimage.metrics import structural_similarity
import cv2

# Load images as grayscale
image1 = cv2.imread('1.png', 0)
image2 = cv2.imread('2.png', 0)

# Compute SSIM between the two images
(score, diff) = structural_similarity(image1, image2, full=True)

# The diff image contains the actual image differences between the two images
# and is represented as a floating point data type in the range [0,1],
# so we must convert the array to 8-bit unsigned integers in the range
# [0,255] before we can use it with OpenCV
diff = (diff * 255).astype("uint8")
print("Image Similarity: {:.4f}%".format(score * 100))

cv2.imshow('diff', diff)
cv2.waitKey()
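To turn the diff image into a location, one option is to threshold it and take the bounding box of the dark (dissimilar) pixels. The sketch below uses a synthetic stand-in for the uint8 diff produced above; the function name and the threshold of 128 are assumptions.

```python
import numpy as np

def changed_bounding_box(diff, threshold=128):
    """Bounding box (top, left, bottom, right) of pixels darker than threshold, or None."""
    rows, cols = np.nonzero(diff < threshold)
    if rows.size == 0:
        return None
    return int(rows.min()), int(cols.min()), int(rows.max()), int(cols.max())

# Stand-in for the uint8 diff image: mostly white (similar), with one dark
# rectangle where the two images disagree.
diff = np.full((20, 30), 255, dtype=np.uint8)
diff[5:9, 10:16] = 40
print(changed_bounding_box(diff))   # (5, 10, 8, 15)
```

This returns a single box enclosing every changed pixel. To separate several changed regions, a contour or connected-component pass such as cv2.findContours on the thresholded image would be the usual next step.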

can you update your answer as compare_ssim function is removed in skimage v0.18 — user889030, Dec 10 '19 at 07:19
@user889030 thanks updated. Do you know if it was removed permanently, or moved to some other module? — nathancy, Dec 10 '19 at 20:20
Looks like it's moved to [`skimage.metrics.structural_similarity`](https://scikit-image.org/docs/dev/api/skimage.metrics.html#structural-similarity) in the latest version — nathancy, Dec 11 '19 at 20:29

score 1 · Answer 4 · answered Jun 16 '19 at 22:45

This is a well-known, classic computer vision problem called background subtraction. There are many approaches to it, and most of them are already implemented, so I think you should first take a look at the existing algorithms. Here is an open-source implementation of most of them: https://github.com/andrewssobral/bgslibrary (I personally found SUBSENSE to give the best results, but it's very slow).
