I have the following image:

enter image description here

I want to extract the boxed diagrams as so:

enter image description here

enter image description here

Here's what I've attempted:

import cv2
import matplotlib.pyplot as plt

# Load the image
image = cv2.imread('diagram.jpg')

# Convert to grayscale
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

# Apply thresholding to create a binary image
_, thresh = cv2.threshold(gray, 127, 255, cv2.THRESH_BINARY_INV)

# Find contours
contours, hierarchy = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

# Draw the contours
cv2.drawContours(image, contours, -1, (0, 0, 255), 2)

# Show the final image
plt.imshow(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))  # matplotlib expects RGB, OpenCV loads BGR
plt.show()

However, I've realized it'll be difficult to extract the diagrams because the contours aren't closed:

enter image description here

I've tried using morphological closing to close the gaps in the box edges:

# Define a rectangular kernel for morphological closing
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (5, 5))

# Perform morphological closing to close the gaps in the box edges
closed = cv2.morphologyEx(thresh, cv2.MORPH_CLOSE, kernel)

But this changes almost nothing. How should I approach this problem?

2 Answers

We can replace the morphological closing with a dilate followed by an erode, and fill the contours between the two steps.

To close the gaps, the kernel must be much larger than 5x5 (I used 51x51), because a closing can only bridge gaps narrower than its kernel.


Assuming the handwritten boxes are colored, we can convert from BGR to HSV and threshold the saturation channel:

hsv = cv2.cvtColor(image, cv2.COLOR_BGR2HSV)  # Convert from BGR to HSV color space
gray = hsv[:, :, 1]  # Use the saturation channel of HSV as "gray"
_, thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)  # Automatic threshold (Otsu)

Dilate with a large kernel, then use drawContours to fill the contours:

kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (51, 51))  # Use relatively large kernel for closing the gaps   
dilated = cv2.dilate(thresh, kernel)  # Dilate with large kernel

contours, hierarchy = cv2.findContours(dilated, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cv2.drawContours(dilated, contours, -1, 255, -1)

Apply erode after filling the contours. An erode after a dilate is equivalent to a closing, but here the erode runs on the filled shapes, so the boxes stay solid.

closed = cv2.erode(dilated, kernel)

Code sample:

import cv2
import numpy as np

# Load the image
image = cv2.imread('diagram.png')

hsv = cv2.cvtColor(image, cv2.COLOR_BGR2HSV)  # Convert from BGR to HSV color space 

# Use the saturation channel of HSV as the "gray" image
gray = hsv[:, :, 1]

# Apply thresholding to create a binary image
_, thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_OTSU)  # Apply automatic thresholding (use THRESH_OTSU).

thresh = np.pad(thresh, ((100, 100), (100, 100)))  # Add zero padding (required due to large dilate kernels).

# Define a rectangular kernel for morphological operations.
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (51, 51))  # Use relatively large kernel for closing the gaps

dilated = cv2.dilate(thresh, kernel)  # Dilate with large kernel

# Fill the contours, before applying erode.
contours, hierarchy = cv2.findContours(dilated, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cv2.drawContours(dilated, contours, -1, 255, -1)

closed = cv2.erode(dilated, kernel)  # Apply erode after filling the contours.

closed = closed[100:-100, 100:-100]  # Remove the padding.

# Find contours
contours, hierarchy = cv2.findContours(closed, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

# Draw the contours
cv2.drawContours(image, contours, -1, (255, 0, 0), 2)

# Show images for testing
cv2.imshow('gray', gray)
cv2.imshow('thresh', thresh)
cv2.imshow('dilated', dilated)
cv2.imshow('closed', closed)
cv2.imshow('image', image)
cv2.waitKey()
cv2.destroyAllWindows()

Result:
enter image description here

gray (saturation channel):
enter image description here

thresh:
enter image description here

dilated (after filling):
enter image description here

closed:
enter image description here

Rotem

You just need to dilate the image so that the rectangles become closed, then keep only the contours whose area exceeds a threshold:

import cv2

# Load the image
image = cv2.imread('diagram.jpg')

# Convert to grayscale
gray = cv2.cvtColor(image,cv2.COLOR_BGR2GRAY)

# Apply thresholding to create a binary image
ret,thresh = cv2.threshold(gray,200,255,cv2.THRESH_BINARY_INV)

# Dilate then erode to close small gaps (a None kernel means the default 3x3)
dilate = cv2.dilate(thresh,None)
erode = cv2.erode(dilate,None)

# Find contours
contours,hierarchy = cv2.findContours(erode,cv2.RETR_CCOMP,cv2.CHAIN_APPROX_SIMPLE)

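clean = image.copy()  # Crop from an unmarked copy so the green rectangles are not saved

for i,cnt in enumerate(contours):
    # Keep only external contours (no parent) whose area is more than 8000
    if hierarchy[0,i,3] == -1 and cv2.contourArea(cnt)>8000:
        x,y,w,h = cv2.boundingRect(cnt)
        cv2.rectangle(image,(x,y),(x+w,y+h),(0,255,0),2)
        cv2.imwrite('template {0}.jpg'.format(i), clean[y:y+h,x:x+w])
cv2.imshow('img',image)
cv2.waitKey(0)  # Keep the window open until a key is pressed
cv2.destroyAllWindows()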

You will get:

enter image description here

HMH1013