
I was reading through the "Affinity Measures" section of the paper by Ferrari et al. As I understand it, Ferrari et al. obtain affinity from:

  1. Location affinity - the intersection-over-union between two detections
  2. Appearance affinity - the Euclidean distance between histograms
  3. KLT point affinity measure

However, I have two main problems:

  1. I do not understand what is actually meant by the intersection-over-union of two detections, or how to calculate it.
  2. I tried a slightly different appearance affinity measure: I transformed the RGB detection to HSV, concatenated the Hue and Saturation channels into one vector, and used it to compare with other detections. However, this technique failed, as a detection of a bag had a better similarity score than a detection of the same person's head (at a different orientation).

Do you have any suggestions or solutions to the problems described above? Thank you; your help is very much appreciated.

Sambas23

4 Answers


Try Intersection over Union

Intersection over Union is an evaluation metric used to measure the accuracy of an object detector on a particular dataset.

More formally, in order to apply Intersection over Union to evaluate an (arbitrary) object detector we need:

  1. The ground-truth bounding boxes (i.e., the hand labeled bounding boxes from the testing set that specify where in the image our object is).
  2. The predicted bounding boxes from our model.

Below I have included a visual example of a ground-truth bounding box versus a predicted bounding box:

[Figure: an image of a stop sign with a ground-truth bounding box and a predicted bounding box drawn around it]

The predicted bounding box is drawn in red while the ground-truth (i.e., hand labeled) bounding box is drawn in green.

In the figure above we can see that our object detector has detected the presence of a stop sign in an image.

Intersection over Union can therefore be computed as:

IoU = Area of Overlap / Area of Union

As long as we have these two sets of bounding boxes we can apply Intersection over Union.

Here is the Python code:

# import the necessary packages
from collections import namedtuple
import numpy as np
import cv2

# define the `Detection` object
Detection = namedtuple("Detection", ["image_path", "gt", "pred"])

def bb_intersection_over_union(boxA, boxB):
    # boxes are given as (x1, y1, x2, y2): top-left and bottom-right corners
    # determine the (x, y)-coordinates of the intersection rectangle
    xA = max(boxA[0], boxB[0])
    yA = max(boxA[1], boxB[1])
    xB = min(boxA[2], boxB[2])
    yB = min(boxA[3], boxB[3])

    # compute the area of the intersection rectangle, clamping at zero
    # so that non-overlapping boxes yield an empty intersection
    interArea = max(0, xB - xA) * max(0, yB - yA)

    # compute the area of both the prediction and ground-truth
    # rectangles
    boxAArea = (boxA[2] - boxA[0]) * (boxA[3] - boxA[1])
    boxBArea = (boxB[2] - boxB[0]) * (boxB[3] - boxB[1])

    # compute the intersection over union by taking the intersection
    # area and dividing it by the sum of prediction + ground-truth
    # areas - the interesection area
    iou = interArea / float(boxAArea + boxBArea - interArea)

    # return the intersection over union value
    return iou

Here, gt and pred are:

  1. gt : The ground-truth bounding box.
  2. pred : The predicted bounding box from our model.

For more information, see this post.

GoingMyWay

  • this code breaks (returns negative values) for non-overlapping rectangles. – Felix Kreuk Sep 10 '17 at 15:20
  • For boxes (0,0,10,10) and (1,1,11,11) this gives wrong results; the intersection area is 100 instead of 81. – Aleksandar Jovanovic Oct 23 '17 at 14:00
  • @AleksandarJovanovic, thank you, it is because the original code added 1 to compute the area. Is it right now? – GoingMyWay Oct 23 '17 at 15:10
  • What is the order of the coordinates for `boxA` and `boxB`? – Jash Shah May 24 '18 at 06:56
  • This code does not take into account when the rectangles are not overlapping. See this answer on how to perform this correctly: https://stackoverflow.com/questions/25349178/calculating-percentage-of-bounding-box-overlap-for-image-detector-evaluation – spurra Apr 23 '19 at 02:11
  • You removed the `+ 1` in the area calculations as seen in the original source. Why? – X_Trust Jun 26 '19 at 20:25

1) You have two overlapping bounding boxes. You compute the intersection of the boxes, which is the area of the overlap. Then you compute the union of the boxes, which is the sum of the areas of the two entire boxes minus the area of the overlap. Finally, you divide the intersection by the union. If you are using MATLAB, the Computer Vision System Toolbox has a function for this called bboxOverlapRatio.
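The same computation can be sketched in plain Python (a minimal illustration of the description above, not the toolbox function):

```python
def iou(box_a, box_b):
    # boxes are given as (x1, y1, x2, y2) corner coordinates
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    # clamp at zero so disjoint boxes get an empty intersection
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    # union = both areas minus the overlap counted twice
    return inter / float(area_a + area_b - inter)

print(iou((0, 0, 10, 10), (5, 5, 15, 15)))  # intersection 25, union 175
```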

2) Generally, you don't want to concatenate the color channels. What you want instead is a 3D histogram, whose dimensions are H, S, and V.

Dima
  • Thank you very much, Dima. Your assistance is greatly appreciated; I wish I could upvote your answer. So for question 2), if I understood you correctly, I will create a 3D histogram and then find the difference between two 3D histograms to check similarity. Should this actually differ from simply concatenating the channels? The values should be the same, no? Thanks – Sambas23 Feb 25 '15 at 23:41

The existing answers already explain the question clearly, so here I provide a version of IoU in Python that doesn't break when the two bounding boxes don't intersect.

import numpy as np

def IoU(box1: np.ndarray, box2: np.ndarray):
    """
    calculate intersection over union cover percent
    :param box1: box1 with shape (N,4) or (N,2,2) or (2,2) or (4,). first shape is preferred
    :param box2: box2 with shape (N,4) or (N,2,2) or (2,2) or (4,). first shape is preferred
    :return: IoU ratio if intersect, else 0
    """
    # first unify all boxes to shape (N,4)
    if box1.shape[-1] == 2 or len(box1.shape) == 1:
        box1 = box1.reshape(1, 4) if len(box1.shape) <= 2 else box1.reshape(box1.shape[0], 4)
    if box2.shape[-1] == 2 or len(box2.shape) == 1:
        box2 = box2.reshape(1, 4) if len(box2.shape) <= 2 else box2.reshape(box2.shape[0], 4)
    point_num = max(box1.shape[0], box2.shape[0])
    b1p1, b1p2, b2p1, b2p2 = box1[:, :2], box1[:, 2:], box2[:, :2], box2[:, 2:]

    # mask that eliminates non-intersecting matrices
    base_mat = np.ones(shape=(point_num,))
    base_mat *= np.all(np.greater(b1p2 - b2p1, 0), axis=1)
    base_mat *= np.all(np.greater(b2p2 - b1p1, 0), axis=1)

    # I area
    intersect_area = np.prod(np.minimum(b2p2, b1p2) - np.maximum(b1p1, b2p1), axis=1)
    # U area
    union_area = np.prod(b1p2 - b1p1, axis=1) + np.prod(b2p2 - b2p1, axis=1) - intersect_area
    # IoU
    intersect_ratio = intersect_area / union_area

    return base_mat * intersect_ratio
S. Sean

Here's yet another solution I implemented that works for me.

Borrowed heavily from PyImageSearch


def bbox_intersects(bbox_a, bbox_b):
    # two axis-aligned boxes intersect iff their x-ranges and y-ranges
    # both overlap; testing only corner containment would miss the case
    # where the boxes cross without either containing a corner of the other
    return bbox_a['x0'] <= bbox_b['x1'] and bbox_b['x0'] <= bbox_a['x1'] and \
        bbox_a['y0'] <= bbox_b['y1'] and bbox_b['y0'] <= bbox_a['y1']

def bbox_area(x0, y0, x1, y1):
    return (x1-x0) * (y1-y0)

def get_bbox_iou(bbox_a, bbox_b):
    if bbox_intersects(bbox_a, bbox_b):
        x_left = max(bbox_a['x0'], bbox_b['x0'])
        x_right = min(bbox_a['x1'], bbox_b['x1'])
        y_top = max(bbox_a['y0'], bbox_b['y0'])
        y_bottom = min(bbox_a['y1'], bbox_b['y1'])

        inter_area = bbox_area(x0=x_left, y0=y_top, x1=x_right, y1=y_bottom)
        bbox_a_area = bbox_area(**bbox_a)
        bbox_b_area = bbox_area(**bbox_b)

        return inter_area / float(bbox_a_area + bbox_b_area - inter_area)
    else:
        return 0
jho