
I have an image of a scanned page, and I'm using skimage to try to detect:

  1. how many images are actually within the page - I'm expecting it to 'count' 5

  2. find the corners of each image - so if it counts 5 above our maxCorners should be 5*4=20

  3. draw straight lines between each corner and 'mask' each of the 5 images

Right now all I have is the image being read, an Otsu threshold and a fill-holes step, and that's about it. Any guidance on the rest?

from scipy import ndimage
from skimage import io, filters
import matplotlib.pyplot as plt

filename = "C:\\Users\\Tony\\Pictures\\img807.tif"

# Read as grayscale, threshold with Otsu, and fill holes in the dark regions.
im = io.imread(filename, as_gray=True)
val = filters.threshold_otsu(im)
drops = ndimage.binary_fill_holes(im < val)
plt.imshow(drops, cmap='gray')
plt.show()
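
One possible way to continue from the filled mask above with skimage alone is to label the connected regions and take their bounding boxes. This is only a rough sketch; the area cutoff (10000 pixels) is a placeholder that would need tuning for the actual scan:

from skimage import measure

# Label connected regions in the filled binary mask produced above.
labels = measure.label(drops)

# Keep only large regions; the area cutoff is a guess and needs tuning per scan.
regions = [r for r in measure.regionprops(labels) if r.area > 10000]

print("Found {} candidate images".format(len(regions)))

fig, ax = plt.subplots()
ax.imshow(im, cmap='gray')
for r in regions:
    minr, minc, maxr, maxc = r.bbox
    # Four corners of the bounding box, clockwise from top-left (x = col, y = row).
    corners = [(minc, minr), (maxc, minr), (maxc, maxr), (minc, maxr)]
    xs = [c[0] for c in corners] + [corners[0][0]]
    ys = [c[1] for c in corners] + [corners[0][1]]
    ax.plot(xs, ys, '-r', linewidth=1)
plt.show()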

I've tried looking at a few resources, specifically the last one on corner detection...


Here's the original image (a representation; the true full-resolution source is linked below):

True source link (12hrs): https://u.pcloud.link/publink/show?code=XZDONYXZVkUNFT2qcEFk4nFYYnx7d8swzaD7

Tony

1 Answer


This answer was the key to solving this problem.


Coords:

[[(38, 11), (251, 364)], [(254, 62), (592, 266)], [(254, 312), (592, 518)], [(46, 456), (247, 797)], [(346, 557), (526, 797)]]

import numpy as np
import matplotlib.pyplot as plt
import cv2
import itertools

#====================================================
img = cv2.imread('input.jpg', 0)
blur = cv2.blur(img,(3,3))

blur[blur>225] = 0   # suppress near-white (page background) pixels

sobelx = cv2.Sobel(blur,cv2.CV_64F,1,0,ksize=5)
sobely = cv2.Sobel(blur,cv2.CV_64F,0,1,ksize=5)
sobel = np.sqrt( sobelx**2 + sobely**2)

sobel = (255 * sobel)/(sobel.max() - sobel.min())
sobel = sobel.astype(np.uint8)
sobel[sobel<20] = 0
sobel[sobel>20] = 255
#====================================================

_,thresh = cv2.threshold(blur,127,255,cv2.THRESH_BINARY_INV)
thresh = thresh + sobel

median = cv2.medianBlur(thresh,3)

gray_scale = median.copy()

image = np.stack([img, img, img], axis=2)   # 3-channel copy for drawing colored rectangles

img_bin = cv2.Canny(gray_scale,50,110)
dil_kernel = np.ones((3,3), np.uint8)
img_bin=cv2.dilate(img_bin,dil_kernel,iterations=1)

line_min_width = 7

kernal_h = np.ones((2,line_min_width), np.uint8)
img_bin_h = cv2.morphologyEx(img_bin, cv2.MORPH_OPEN, kernal_h)

kernal_v = np.ones((line_min_width,1), np.uint8)
img_bin_v = cv2.morphologyEx(img_bin, cv2.MORPH_OPEN, kernal_v)

img_bin_final=img_bin_h|img_bin_v
final_kernel = np.ones((3,3), np.uint8)
img_bin_final=cv2.dilate(img_bin_final,final_kernel,iterations=1)

_, _, stats, _ = cv2.connectedComponentsWithStats(~img_bin_final, connectivity=8, ltype=cv2.CV_32S)

coords = []
### labels 0 and 1 are the background and residual connected components which we do not require
for x,y,w,h,area in stats[2:]:
    if area>15000:
        coords.append([(x,y),(x+w,y+h)])

def bb_intersection(coords, boxA, boxB):
    # remove from `coords` whichever box is fully contained inside the other
    # determine the (x, y)-coordinates of the intersection rectangle
    xA = max(boxA[0][0], boxB[0][0])
    yA = max(boxA[0][1], boxB[0][1])
    xB = min(boxA[1][0], boxB[1][0])
    yB = min(boxA[1][1], boxB[1][1])
    # compute the area of intersection rectangle
    interArea = max(0, xB - xA + 1) * max(0, yB - yA + 1)
    # compute the area of both rectangles
    boxAArea = (boxA[1][0] - boxA[0][0] + 1) * (boxA[1][1] - boxA[0][1] + 1)
    boxBArea = (boxB[1][0] - boxB[0][0] + 1) * (boxB[1][1] - boxB[0][1] + 1)

    if(interArea == boxAArea):
        coords.remove(boxA)
    elif(interArea == boxBArea):
        coords.remove(boxB)
#
for boxa, boxb in itertools.combinations(coords, 2):
    bb_intersection(coords, boxa, boxb)

for coord in coords:
    cv2.rectangle(image,coord[0],coord[1],(0,255,0),1)

print(coords)

plt.imshow(image)
plt.title("There are {} images".format(len(coords)))
plt.axis('off')
plt.show()
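
The question also asked about masking each of the detected photos; a minimal sketch of that step, reusing the `coords` list produced above (the variable names `crops`, `mask` and `masked` are just illustrative):

# Crop each detected region out of the grayscale input, and build a mask
# that blanks everything outside the detected boxes.
crops = []
mask = np.zeros_like(img)
for (x1, y1), (x2, y2) in coords:
    crops.append(img[y1:y2, x1:x2])
    mask[y1:y2, x1:x2] = 255

masked = cv2.bitwise_and(img, mask)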

Edit:

This answer isn't a general solution; the parameters have to be tuned accordingly. For the original image, change this block and the code will work:

img = cv2.imread('input.tif', 0)
img = cv2.resize(img, (605, 830))

blur = img.copy()
blur[blur>225] = 0

sobelx = cv2.Sobel(blur,cv2.CV_64F,1,0,ksize=3)
sobely = cv2.Sobel(blur,cv2.CV_64F,0,1,ksize=3)
sobel = np.sqrt( sobelx**2 + sobely**2)

sobel = (255 * sobel)/(sobel.max() - sobel.min())
sobel = sobel.astype(np.uint8)
sobel[sobel<40] = 0
sobel[sobel>40] = 255

Coords:

[[(31, 16), (240, 364)], [(253, 56), (600, 265)], [(254, 309), (605, 520)], [(40, 456), (248, 803)], [(347, 557), (534, 803)]]

Bilal
  • strange - I only get 2 images and no bounding boxes around them... [[(2316, 494), (5579, 2500)], [(3190, 5163), (4967, 7551)]] – Tony Jan 15 '21 at 22:59
  • @Tony I used this code with the image you provided and got these results; the screenshot is from the `Matplotlib` output. Are you running the same code on this [image](https://i.stack.imgur.com/OPkTg.jpg)? – Bilal Jan 15 '21 at 23:05
  • Actually no, it's not the true source, but I'm curious why it works on the representation of the source but not on the actual thing (I've updated the main post with a link to the true source). – Tony Jan 15 '21 at 23:09
  • @Tony please see the updated answer; if you want to map the coordinates to your original image, you have to multiply the coordinates by the resize ratio. – Bilal Jan 15 '21 at 23:47
  • Thanks! Confirmed it's working now. How do you know what to tune the numbers to, i.e. if I switch images? – Tony Jan 15 '21 at 23:51
  • @Tony it is empirical tuning; I don't have an exact formula for doing that. – Bilal Jan 15 '21 at 23:54
  • It's strange to me that it doesn't work without the resize... how do I get it working without having to resize the original? – Tony Jan 15 '21 at 23:59
  • @Tony you simply have to increase the size of the kernels used, which requires much more computation. – Bilal Jan 16 '21 at 00:06
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/227395/discussion-between-tony-and-bilal). – Tony Jan 16 '21 at 00:12
  • Very interesting stuff... so if I now wanted to mask the original image based on the 5 found? I assume there's some math to scale up what was found to the original? – Tony Jan 16 '21 at 01:31
  • @Tony you divide the original shape by the resized shape, then multiply the `x, y` values in `coords` by that ratio, and don't forget to use the `floor` function to make the results integers (see the sketch after these comments). – Bilal Jan 16 '21 at 07:49
  • qq - why take the square root of the Sobel? this line `sobel = np.sqrt(sobelx ** 2 + sobely ** 2)` – Tony Jan 18 '21 at 21:33
  • @Tony the magnitude of the Sobel filter is the square root of the x component squared plus the y component squared, i.e. `sqrt(sobelx**2 + sobely**2)`. – Bilal Jan 19 '21 at 03:37
  • Hi @Bilal, can you help with the proper settings for this image? I've been playing with all the numbers but can't seem to get it right: https://u.pcloud.link/publink/show?code=XZqR0bXZ5AV0g52XgCYfMIWyAaQXAuNcmA6V (scaled to 10% of the original in the link). – Tony Jan 27 '21 at 00:25
  • @Tony this algorithm isn't general; it can't work with every image, whatever it is. – Bilal Jan 27 '21 at 17:37
  • @Tony this algorithm isn't general. For the new image, if you want to split it, you can do that easily: just look at the rows which split the images and split the array directly. There's no need for this algorithm, which mainly detects the lines in the image that define the borders. – Bilal Jan 28 '21 at 21:02
  • how can I visually see which row would split the images properly? – Tony Jan 28 '21 at 21:08
  • @Tony `plt.imshow(img)` >>> Zoom , `# 2747, 5630` are the rows which define the borders between your 3 images, `image1=img[:2747,:]`, `image2=img[2747:5630,:]`, and `image3=img[5630:,:]` will split your images. – Bilal Jan 28 '21 at 21:22
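
For reference, a small sketch of the coordinate scaling described in the comments above: divide the original shape by the resized shape, scale each coordinate in `coords` by that ratio, and floor the results. `orig` here stands for the full-resolution image and `img` for the resized one; both names are illustrative:

orig = cv2.imread('input.tif', 0)        # full-resolution scan
ratio_x = orig.shape[1] / img.shape[1]   # width ratio (columns)
ratio_y = orig.shape[0] / img.shape[0]   # height ratio (rows)

orig_coords = [
    [(int(np.floor(x1 * ratio_x)), int(np.floor(y1 * ratio_y))),
     (int(np.floor(x2 * ratio_x)), int(np.floor(y2 * ratio_y)))]
    for (x1, y1), (x2, y2) in coords
]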